Update: A beta version of XRegExp 0.3 is now available as part of the RegexPal download package.

JavaScript's regular expression flavor doesn't support named capture. Well, says who? XRegExp 0.2 brings named capture support, along with several other new features. But first of all, if you haven't seen the previous version, make sure to check out my post on XRegExp 0.1, because not all of the documentation is repeated below.

Highlights

  • Comprehensive named capture support (New)
  • Supports regex literals through the addFlags method (New)
  • Free-spacing and comments mode (x)
  • Dot matches all mode (s)
  • Several other minor improvements over v0.1

Named capture

There are several different syntaxes in the wild for named capture. I've compiled the following table based on my understanding of the regex support of the libraries in question. XRegExp's syntax is included at the top.

Library | Capture | Backreference | In replacement | Stored at
XRegExp | (<name>…) | \k<name> | ${name} | result.name
.NET | (?<name>…), (?'name'…) | \k<name>, \k'name' | ${name} | Matcher.Groups('name')
Perl 5.10 (beta) | (?<name>…), (?'name'…) | \k<name>, \k'name', \g{name} | $+{name} | ??
Python | (?P<name>…) | (?P=name) | \g<name> | result.group('name')
PHP preg (PCRE) | (.NET, Perl, and Python styles) | (.NET, Perl, and Python styles) | $regs['name'] | $result['name']

No other major regex library currently supports named capture, although the JGsoft engine (used by products like RegexBuddy) supports both .NET and Python syntax. XRegExp does not use a question mark at the beginning of a named capturing group because that would prevent it from being used in regex literals (JavaScript would immediately throw an "invalid quantifier" error).

XRegExp supports named capture on an on-request basis. You can add named capture support to any regex through the use of the new "k" flag. This is done for compatibility reasons and to ensure that regex compilation time remains as fast as possible in all situations.

Following are several examples of using named capture:

// Add named capture support using the XRegExp constructor
var repeatedWords = new XRegExp("\\b (<word> \\w+ ) \\s+ \\k<word> \\b", "gixk");

// Add named capture support using RegExp, after overriding the native constructor
XRegExp.overrideNative();
var repeatedWords = new RegExp("\\b (<word> \\w+ ) \\s+ \\k<word> \\b", "gixk");

// Add named capture support to a regex literal
var repeatedWords = /\b (<word> \w+ ) \s+ \k<word> \b/.addFlags("gixk");

var data = "The the test data.";

// Check if data contains repeated words
var hasDuplicates = repeatedWords.test(data);
// hasDuplicates: true

// Use the regex to remove repeated words
var output = data.replace(repeatedWords, "${word}");
// output: "The test data."

In the above code, I've also used the x flag provided by XRegExp, to improve readability. Note that the addFlags method can be called multiple times on the same regex (e.g., /pattern/g.addFlags("k").addFlags("s")), but I'd recommend adding all flags in one shot, for efficiency.
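
To get a feel for what the k flag does behind the scenes, here is a minimal sketch of the core idea: strip the <name> part from each capturing group, remember its position, and rewrite \k<name> as a numbered backreference. The compileNamed helper is hypothetical and not part of XRegExp, and it assumes the pattern contains no escaped parentheses, no character classes containing "(", and no (?…) groups, all of which the real code accounts for.

```javascript
// Hypothetical helper (not part of XRegExp): a simplified take on the "k" flag.
// Assumes no escaped parentheses, no "(" inside character classes, no (?...) groups.
function compileNamed(pattern, flags) {
	var names = [];
	// Strip <name> from each capturing group, recording the name by group position
	var source = pattern.replace(/\((?:<([$\w]+)>)?/g, function ($0, $1) {
		names.push($1 || null);
		return "(";
	});
	// Rewrite \k<name> as the numbered backreference for that group
	source = source.replace(/\\k<([$\w]+)>/g, function ($0, $1) {
		var index = names.indexOf($1);
		return index > -1 ? "\\" + (index + 1) : $0;
	});
	return { regex: new RegExp(source, flags), names: names };
}

var compiled = compileNamed("\\b(<word>\\w+)\\s+\\k<word>\\b", "gi");
// compiled.regex.source: \b(\w+)\s+\1\b

var cleaned = "The the test data.".replace(compiled.regex, "$1");
// cleaned: "The test data."
```

Because the result is an ordinary numbered-capture regex, the only extra state needed afterwards is the list of names, which is roughly what the _captureNames property in the code further down stores.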

Here are a few more examples of using named capture, with an overly simplistic URL-matching regex (for comprehensive URL parsing, see parseUri):

var url = "http://microsoft.com/path/to/file?q=1";
var urlParser = new XRegExp("^(<protocol>[^:/?]+)://(<host>[^/?]*)(<path>[^?]*)\\?(<query>.*)", "k");
var parts = urlParser.exec(url);
/* The result:
parts.protocol: "http"
parts.host: "microsoft.com"
parts.path: "/path/to/file"
parts.query: "q=1" */

// Named backreferences are also available in replace() callback functions as properties of the first argument
var newUrl = url.replace(urlParser, function(match){
	return match.replace(match.host, "yahoo.com");
});
// newUrl: "http://yahoo.com/path/to/file?q=1"

Note that XRegExp's named capture functionality does not support deprecated JavaScript features including the lastMatch property of the global RegExp object and the RegExp.prototype.compile() method.
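
For readers curious how result.name ends up populated, the following sketch shows one way to copy numbered captures onto named properties of a match array. It runs without XRegExp: execNamed is a hypothetical helper, and the list of names is passed in by hand instead of being read from the regex.

```javascript
// Hypothetical helper: copy numbered captures onto named properties of the result.
// names[i] holds the name of capturing group i + 1, or null if the group is unnamed.
function execNamed(regex, names, str) {
	var result = regex.exec(str);
	if (!result) return result;
	for (var i = 1; i < result.length; i++) {
		var name = names[i - 1];
		if (name) result[name] = result[i];
	}
	return result;
}

var parts = execNamed(
	/^([^:\/?]+):\/\/([^\/?]*)([^?]*)\?(.*)/,
	["protocol", "host", "path", "query"],
	"http://microsoft.com/path/to/file?q=1"
);
// parts.protocol: "http"
// parts.host: "microsoft.com"
// parts.path: "/path/to/file"
// parts.query: "q=1"
```

The numbered captures (parts[1] through parts[4]) remain available alongside the named properties.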

Singleline (s) and extended (x) modes

The other non-native flags XRegExp supports are s (singleline) for "dot matches all" mode, and x (extended) for "free-spacing and comments" mode. For full details about these modifiers, see the FAQ in my XRegExp 0.1 post. However, one difference from the previous version is that XRegExp 0.2, when using the x flag, now allows whitespace between a regex token and its quantifier (quantifiers are, e.g., +, *?, or {1,3}). Although the previous version's handling/limitation in this regard was documented, it was atypical compared to other regex libraries. This has been fixed.
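
As a rough illustration of both modes, the sketch below applies the singleline and extended patterns from the XRegExp._re object (shown in the code section) directly to pattern strings. The helper names dotMatchesAll and stripFreeSpacing are hypothetical, and the sketch skips the other steps addFlags performs.

```javascript
// Patterns taken from XRegExp._re; the helper names below are hypothetical.
var singleline = /(?:[^[\\.]+|\\(?:[\S\s]|$)|\[\^?]?(?:[^\\\]]+|\\(?:[\S\s]|$))*]?)+|\./g;
var extended = /(?:[^[#\s\\]+|\\(?:[\S\s]|$)|\[\^?]?(?:[^\\\]]+|\\(?:[\S\s]|$))*]?)+|(\s*#[^\n\r]*\s*|\s+)([?*+]|{\d+(?:,\d*)?})?/g;

// "s" flag: replace each unescaped dot outside a character class with [\S\s]
function dotMatchesAll(pattern) {
	return pattern.replace(singleline, function ($0) {
		return $0 === "." ? "[\\S\\s]" : $0;
	});
}

// "x" flag: drop whitespace and comments. If a quantifier follows the whitespace,
// keep only the quantifier; otherwise insert (?:) so adjacent tokens stay separate.
function stripFreeSpacing(pattern) {
	return pattern.replace(extended, function ($0, $1, $2) {
		return $1 ? ($2 ? $2 : "(?:)") : $0;
	});
}

var anyChar = new RegExp(dotMatchesAll("a.b"));
var matchesNewline = anyChar.test("a\nb");
// matchesNewline: true

var digits = new RegExp(stripFreeSpacing("\\d + # one or more digits"));
var firstNumber = digits.exec("abc 123")[0];
// firstNumber: "123"
```

Escaped dots and dots inside character classes are left alone, since the first alternative of each pattern consumes those constructs whole.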

The code

/* XRegExp 0.2.2; MIT License
By Steven Levithan <http://stevenlevithan.com>
----------
Adds support for the following regular expression features:
- Free-spacing and comments ("x" flag)
- Dot matches all ("s" flag)
- Named capture ("k" flag)
 - Capture: (<name>...)
 - Backreference: \k<name>
 - In replacement: ${name}
 - Stored at: result.name
*/

/* Protect this from running more than once, which would break its references to native functions */
if (window.XRegExp === undefined) {
	var XRegExp;
	
	(function () {
		var native = {
			RegExp: RegExp,
			exec: RegExp.prototype.exec,
			match: String.prototype.match,
			replace: String.prototype.replace
		};
		
		XRegExp = function (pattern, flags) {
			return native.RegExp(pattern).addFlags(flags);
		};
		
		RegExp.prototype.addFlags = function (flags) {
			var pattern = this.source,
				useNamedCapture = false,
				re = XRegExp._re;
			
			flags = (flags || "") + native.replace.call(this.toString(), /^[\S\s]+\//, "");
			
			if (flags.indexOf("x") > -1) {
				pattern = native.replace.call(pattern, re.extended, function ($0, $1, $2) {
					return $1 ? ($2 ? $2 : "(?:)") : $0;
				});
			}
			
			if (flags.indexOf("k") > -1) {
				var captureNames = [];
				pattern = native.replace.call(pattern, re.capturingGroup, function ($0, $1) {
					if (/^\((?!\?)/.test($0)) {
						if ($1) useNamedCapture = true;
						captureNames.push($1 || null);
						return "(";
					} else {
						return $0;
					}
				});
				if (useNamedCapture) {
					/* Replace named with numbered backreferences */
					pattern = native.replace.call(pattern, re.namedBackreference, function ($0, $1, $2) {
						var index = $1 ? captureNames.indexOf($1) : -1;
						return index > -1 ? "\\" + (index + 1).toString() + ($2 ? "(?:)" + $2 : "") : $0;
					});
				}
			}
			
			/* If "]" is the leading character in a character class, replace it with "\]" for consistent
			cross-browser handling. This is needed to maintain correctness without the aid of browser sniffing
			when constructing the regexes which deal with character classes. They treat a leading "]" within a
			character class as a non-terminating, literal character, which is consistent with IE, .NET, Perl,
			PCRE, Python, Ruby, JGsoft, and most other regex engines. */
			pattern = native.replace.call(pattern, re.characterClass, function ($0, $1) {
				/* This second regex is only run when a leading "]" exists in the character class */
				return $1 ? native.replace.call($0, /^(\[\^?)]/, "$1\\]") : $0;
			});
			
			if (flags.indexOf("s") > -1) {
				pattern = native.replace.call(pattern, re.singleline, function ($0) {
					return $0 === "." ? "[\\S\\s]" : $0;
				});
			}
			
			var regex = native.RegExp(pattern, native.replace.call(flags, /[sxk]+/g, ""));
			
			if (useNamedCapture) {
				regex._captureNames = captureNames;
			/* Preserve capture names if adding flags to a regex which has already run through addFlags("k") */
			} else if (this._captureNames) {
				regex._captureNames = this._captureNames.valueOf();
			}
			
			return regex;
		};
		
		String.prototype.replace = function (search, replacement) {
			/* If search is not a regex which uses named capturing groups, just run the native replace method */
			if (!(search instanceof native.RegExp && search._captureNames)) {
				return native.replace.apply(this, arguments);
			}
			
			if (typeof replacement === "function") {
				return native.replace.call(this, search, function () {
					/* Convert arguments[0] from a string primitive to a string object which can store properties */
					arguments[0] = new String(arguments[0]);
					/* Store named backreferences on the first argument before calling replacement */
					for (var i = 0; i < search._captureNames.length; i++) {
						if (search._captureNames[i]) arguments[0][search._captureNames[i]] = arguments[i + 1];
					}
					return replacement.apply(window, arguments);
				});
			} else {
				return native.replace.call(this, search, function () {
					var args = arguments;
					return native.replace.call(replacement, XRegExp._re.replacementVariable, function ($0, $1, $2) {
						/* Numbered backreference or special variable */
						if ($1) {
							switch ($1) {
								case "$": return "$";
								case "&": return args[0];
								case "`": return args[args.length - 1].substring(0, args[args.length - 2]);
								case "'": return args[args.length - 1].substring(args[args.length - 2] + args[0].length);
								/* Numbered backreference */
								default:
									/* What does "$10" mean?
									- Backreference 10, if at least 10 capturing groups exist
									- Backreference 1 followed by "0", if at least one capturing group exists
									- Else, it's the string "$10" */
									var literalNumbers = "";
									$1 = +$1; /* Cheap type-conversion */
									while ($1 > search._captureNames.length) {
										literalNumbers = $1.toString().match(/\d$/)[0] + literalNumbers;
										$1 = Math.floor($1 / 10); /* Drop the last digit */
									}
									return ($1 ? args[$1] : "$")   literalNumbers;
							}
						/* Named backreference */
						} else if ($2) {
							/* What does "${name}" mean?
							- Backreference to named capture "name", if it exists
							- Else, it's the string "${name}" */
							var index = search._captureNames.indexOf($2);
							return index > -1 ? args[index + 1] : $0;
						} else {
							return $0;
						}
					});
				});
			}
		};
		
		RegExp.prototype.exec = function (str) {
			var result = native.exec.call(this, str);
			if (!(this._captureNames && result && result.length > 1)) return result;
			
			for (var i = 1; i < result.length; i++) {
				var name = this._captureNames[i - 1];
				if (name) result[name] = result[i];
			}
			
			return result;
		};
		
		String.prototype.match = function (regexp) {
			if (!regexp._captureNames || regexp.global) return native.match.call(this, regexp);
			return regexp.exec(this);
		};
	})();
}

/* Regex syntax parsing with support for escapings, character classes, and various other context and cross-browser issues */
XRegExp._re = {
	extended: /(?:[^[#\s\\]+|\\(?:[\S\s]|$)|\[\^?]?(?:[^\\\]]+|\\(?:[\S\s]|$))*]?)+|(\s*#[^\n\r]*\s*|\s+)([?*+]|{\d+(?:,\d*)?})?/g,
	singleline: /(?:[^[\\.]+|\\(?:[\S\s]|$)|\[\^?]?(?:[^\\\]]+|\\(?:[\S\s]|$))*]?)+|\./g,
	characterClass: /(?:[^\\[]+|\\(?:[\S\s]|$))+|\[\^?(]?)(?:[^\\\]]+|\\(?:[\S\s]|$))*]?/g,
	capturingGroup: /(?:[^[(\\]+|\\(?:[\S\s]|$)|\[\^?]?(?:[^\\\]]+|\\(?:[\S\s]|$))*]?|\((?=\?))+|\((?:<([$\w]+)>)?/g,
	namedBackreference: /(?:[^\\[]+|\\(?:[^k]|$)|\[\^?]?(?:[^\\\]]+|\\(?:[\S\s]|$))*]?|\\k(?!<[$\w]+>))+|\\k<([$\w]+)>(\d*)/g,
	replacementVariable: /(?:[^$]+|\$(?![1-9$&`']|{[$\w]+}))+|\$(?:([1-9]\d*|[$&`'])|{([$\w]+)})/g
};

XRegExp.overrideNative = function () {
	/* Override the global RegExp constructor/object with the XRegExp constructor. This precludes accessing
	properties of the last match via the global RegExp object. However, those properties are deprecated as
	of JavaScript 1.5, and the values are available on RegExp instances or via RegExp/String methods. It also
	affects the result of (/x/.constructor == RegExp) and (/x/ instanceof RegExp), so use with caution. */
	RegExp = XRegExp;
};

/* indexOf method from Mootools 1.11; MIT License */
Array.prototype.indexOf = Array.prototype.indexOf || function (item, from) {
	var len = this.length;
	for (var i = (from < 0) ? Math.max(0, len + from) : from || 0; i < len; i++) {
		if (this[i] === item) return i;
	}
	return -1;
};

You can download it, or get the packed version (2.7 KB).

XRegExp has been tested in IE 5.5–7, Firefox 2.0.0.4, Opera 9.21, Safari 3.0.2 beta for Windows, and Swift 0.2.

Finally, note that the XRE object from v0.1 has been removed. XRegExp now only creates one global variable: XRegExp. To permanently override the native RegExp constructor/object, you can now run XRegExp.overrideNative();
