JavaScript Regex (Regular Expressions)
Regular expressions (Regex) are a powerful tool for pattern matching and text manipulation in JavaScript. They allow you to search, match, and replace patterns within strings with high flexibility. Whether you're validating email addresses, extracting phone numbers, or just performing simple text searches, understanding regular expressions is essential for any developer.
A regular expression (regex) is a sequence of characters that forms a search pattern. It can be used to match strings, extract data, and replace parts of strings. In JavaScript, regex is implemented using the RegExp object, or directly within certain methods like String.match(), String.replace(), String.search(), and String.split().
A regular expression literal is written between two forward slashes (/), like this:
const regex = /pattern/;
Alternatively, you can create a regular expression using the RegExp constructor:
const regex = new RegExp('pattern');
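The two forms differ mainly in how the pattern is written. The sketch below shows one difference worth knowing: inside a string passed to the constructor, backslashes must be doubled. The i and g flags and the variable word are illustrative assumptions, not part of the examples above.

```javascript
// The same pattern written as a literal and through the constructor.
const literal = /\d{3}/i;
const constructed = new RegExp("\\d{3}", "i"); // backslash doubled inside a string

console.log(literal.source === constructed.source); // true

// The constructor is handy when part of the pattern is only known at runtime.
// "word" is a hypothetical value chosen for this sketch.
const word = "cat";
const dynamic = new RegExp("\\b" + word + "\\b", "g");

const hits = "cat concat cat".match(dynamic);
console.log(hits); // ["cat", "cat"]
```

If the runtime value can contain characters that are special in regex (such as . or *), it would need escaping before being placed in the pattern.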
Here are some basic components and patterns in regular expressions:
. : Matches any single character except newlines.
^ : Asserts the start of a string.
$ : Asserts the end of a string.
[] : Defines a character set (e.g., [aeiou] matches any vowel).
| : Acts as an OR operator (e.g., a|b matches 'a' or 'b').
\d : Matches any digit (equivalent to [0-9]).
\D : Matches any non-digit.
\w : Matches any word character (letters, digits, and underscores).
\W : Matches any non-word character.
\s : Matches any whitespace character (spaces, tabs, newlines).
\S : Matches any non-whitespace character.
* : Matches 0 or more occurrences of the preceding element.
+ : Matches 1 or more occurrences of the preceding element.
? : Matches 0 or 1 occurrence of the preceding element.
{n,m} : Matches between n and m occurrences of the preceding element.
() : Groups expressions together for capturing.

JavaScript provides several methods to work with regex patterns. Let's look at some of the most commonly used methods.
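Before moving on to those methods, here is a small sketch of how a few of the components listed above behave in practice. The sample words and the "order-42" string are made up for illustration.

```javascript
const samples = ["cat", "cot", "cut", "coat", "ct"];

// "." matches exactly one character, so "c.t" needs three characters in total.
const anyChar = /^c.t$/;
const anyCharHits = samples.filter((s) => anyChar.test(s));
console.log(anyCharHits); // ["cat", "cot", "cut"]

// A character set restricts the middle character to "a" or "o".
const charSet = /^c[ao]t$/;
const charSetHits = samples.filter((s) => charSet.test(s));
console.log(charSetHits); // ["cat", "cot"]

// "{1,2}" allows one or two vowels between "c" and "t".
const quantified = /^c[aeiou]{1,2}t$/;
const quantifiedHits = samples.filter((s) => quantified.test(s));
console.log(quantifiedHits); // ["cat", "cot", "cut", "coat"]

// Parentheses capture the part of the match they enclose.
const captured = "order-42".match(/^(\w+)-(\d+)$/);
console.log(captured[1], captured[2]); // order 42
```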
test() Method
The test() method is used to check if a string matches a given regular expression. It returns true if there's a match and false if there's not.
const regex = /hello/;
console.log(regex.test("hello world")); // true
console.log(regex.test("world hello")); // true
console.log(regex.test("hi there")); // false
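One detail that can surprise people: when a regex carries the g flag, test() remembers where the last match ended (in the lastIndex property) and resumes from there on the next call. The sketch below illustrates this; the variable names are chosen for this example only.

```javascript
const stateless = /hello/;
const stateful = /hello/g;

// Without "g", repeated calls give the same answer.
const plainFirst = stateless.test("hello world");
const plainSecond = stateless.test("hello world");
console.log(plainFirst, plainSecond); // true true

// With "g", the second call starts searching at lastIndex and fails.
const first = stateful.test("hello world");
const indexAfterFirst = stateful.lastIndex;
const second = stateful.test("hello world");
const indexAfterSecond = stateful.lastIndex;

console.log(first, indexAfterFirst); // true 5
console.log(second, indexAfterSecond); // false 0
```

For a simple yes/no check, leaving off the g flag avoids this behavior.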
match() Method
The match() method retrieves the matches of a regular expression in a string. It returns an array of matches, or null if no matches are found.
const text = "The quick brown fox";
const regex = /\b\w{5}\b/g; // Match whole words of exactly 5 word characters
console.log(text.match(regex)); // ["quick", "brown"]
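The shape of the result depends on the g flag. The following sketch compares match() without g, match() with g, and matchAll(), which is a related string method not covered above. The capture groups are added purely for illustration.

```javascript
const sentence = "The quick brown fox";

// Without "g": only the first match, plus capture groups and its position.
const single = sentence.match(/\b(\w)(\w{4})\b/);
console.log(single[0], single[1], single.index); // quick q 4

// With "g": every full match, but no capture groups or positions.
const all = sentence.match(/\b\w{5}\b/g);
console.log(all); // ["quick", "brown"]

// matchAll() requires "g" and yields full details for every match.
const detailed = [...sentence.matchAll(/\b(\w)(\w{4})\b/g)].map((m) => [m[0], m.index]);
console.log(detailed); // [["quick", 4], ["brown", 10]]

// No match returns null, so check the result before indexing into it.
const none = sentence.match(/\d+/);
console.log(none); // null
```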
replace() Method
The replace() method replaces matched substrings with another string. The pattern can be a plain string or a regular expression. Without the g flag, only the first match is replaced.
const str = "Hello world!";
const regex = /world/;
const newStr = str.replace(regex, "JavaScript");
console.log(newStr); // "Hello JavaScript!"
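A few common variations on replace() can be sketched as follows: replacing every match with the g flag, reusing captured groups with $1 and $2, and computing the replacement with a function. The input strings are invented for this example.

```javascript
const greeting = "world, meet world";

// Without "g" only the first match is replaced; with "g" all of them are.
const firstOnly = greeting.replace(/world/, "JS");
const everyMatch = greeting.replace(/world/g, "JS");
console.log(firstOnly); // "JS, meet world"
console.log(everyMatch); // "JS, meet JS"

// $1 and $2 refer to the first and second capture groups.
const swapped = "Doe, Jane".replace(/^(\w+),\s*(\w+)$/, "$2 $1");
console.log(swapped); // "Jane Doe"

// A function receives each match and returns its replacement.
const doubled = "apple 3, pear 12".replace(/\d+/g, (digits) => String(Number(digits) * 2));
console.log(doubled); // "apple 6, pear 24"
```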
search() Method
The search() method searches for a match and returns the index of the first match, or -1 if no match is found.
const text = "The rain in Spain falls mainly in the plain";
const regex = /Spain/;
console.log(text.search(regex)); // 12 (index of 'Spain')
split() Method
The split() method splits a string into an array of substrings based on a regex pattern.
const text = "apple,banana,cherry";
const regex = /,/;
console.log(text.split(regex)); // ["apple", "banana", "cherry"]
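A regex becomes more useful than a plain string separator when the delimiters vary. As a rough sketch, the pattern below splits on commas, semicolons, or pipes and absorbs any surrounding whitespace. The input string is made up for illustration.

```javascript
const messy = "apple, banana;cherry |  date";

// Split on any of , ; | together with optional whitespace on either side.
const parts = messy.split(/\s*[,;|]\s*/);
console.log(parts); // ["apple", "banana", "cherry", "date"]

// The optional second argument limits how many pieces are returned.
const firstTwo = messy.split(/\s*[,;|]\s*/, 2);
console.log(firstTwo); // ["apple", "banana"]
```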
One common use of regular expressions is to validate email addresses. Here's an example of a simple regex pattern to check if an email is in the correct format:
const emailRegex = /^[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}$/;
console.log(emailRegex.test("example@domain.com")); // true
console.log(emailRegex.test("example@domain")); // false
You can also use regex to validate phone numbers. Here's a pattern that checks for various formats of phone numbers:
const phoneRegex = /^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$/;
console.log(phoneRegex.test("(123) 456-7890")); // true
console.log(phoneRegex.test("123-456-7890")); // true
console.log(phoneRegex.test("1234567890")); // true
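It can help to probe a validation pattern with inputs that should fail as well as ones that should pass. The sketch below runs the same phone pattern against a handful of made-up candidates. Note that because both parentheses are independently optional, an unbalanced "(" is accepted.

```javascript
const phonePattern = /^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$/;

// Hypothetical inputs chosen to exercise different branches of the pattern.
const candidates = [
  "123.456.7890",    // dots as separators
  "123 456 7890",    // spaces as separators
  "(123 456-7890",   // unbalanced parenthesis
  "123-456-789",     // last group too short
  "+1 123-456-7890", // country code prefix
];

const results = candidates.map((c) => phonePattern.test(c));
console.log(results); // [true, true, true, false, false]
```

If unbalanced parentheses matter for your data, the pattern would need an alternation such as (\(\d{3}\)|\d{3}) for the area code instead of two separate optional parentheses.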
If you want to extract date strings (e.g., "2024-12-31") from text, you can use a regex pattern that matches a date format.
const dateRegex = /\d{4}-\d{2}-\d{2}/;
const text = "Today's date is 2024-12-21.";
console.log(text.match(dateRegex)); // ["2024-12-21"]
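If you need the individual parts of the date rather than the whole string, capture groups can pull them out. The sketch below uses named groups; the group names year, month, and day are choices made for this example.

```javascript
const datePattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const note = "Today's date is 2024-12-21.";

const found = note.match(datePattern);

// match() returns null when nothing matches, so fall back to an empty object.
const { year, month, day } = found ? found.groups : {};
console.log(year, month, day); // 2024 12 21

// Without the "g" flag, the result also records where the match started.
const position = found ? found.index : -1;
console.log(position); // 16
```

This pattern only checks the shape of the string: a value like 2024-99-99 would also match, so range checks would have to happen separately.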
You can use regex to clean up extra spaces or unwanted characters from a string.
const text = " Hello world! ";
const cleanText = text.replace(/\s+/g, " ").trim();
console.log(cleanText); // "Hello world!"
Here's how you can extract all URLs from a string using regex:
const text = "Visit https://example.com or http://another-example.com for more info.";
const urlRegex = /https?:\/\/[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}/g;
console.log(text.match(urlRegex)); // ["https://example.com", "http://another-example.com"]
Lookahead and lookbehind assertions allow you to match a pattern only if it is followed or preceded by another pattern, without including it in the match.
const regex = /\d+(?=\s+USD)/;
const text = "Price: 100 USD";
console.log(text.match(regex)); // ["100"]
Here, \d+(?=\s+USD) matches a number followed by whitespace and "USD", but neither the whitespace nor "USD" is included in the match.
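Lookaheads also have a negative form, (?!...), which succeeds only when the given pattern does not follow. The sketch below contrasts the two on a made-up list of amounts. The \b word boundaries in the negative version are an addition that stops the engine from settling for part of a number.

```javascript
const amounts = "100 USD, 200 EUR, 300 USD";

// Positive lookahead: numbers that are followed by whitespace and "USD".
const usd = amounts.match(/\d+(?=\s+USD)/g);
console.log(usd); // ["100", "300"]

// Negative lookahead: whole numbers that are NOT followed by whitespace and "USD".
const notUsd = amounts.match(/\b\d+\b(?!\s+USD)/g);
console.log(notUsd); // ["200"]
```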
const regex = /(?<=@)[a-zA-Z0-9.-]+/;
const email = "user@example.com";
console.log(email.match(regex)); // ["example.com"]
Here, (?<=@) asserts that the match is preceded by "@" but does not include it in the result.

By default, regex tries to match as much text as possible. However, sometimes you need it to match as little as possible. This is called non-greedy matching, and it is done using ? after quantifiers like *, +, or {n,m}.
const regex = /<.*?>/g;
const text = "<div>content</div><p>another content</p>";
console.log(text.match(regex)); // ["<div>", "</div>", "<p>", "</p>"]
Here, .*? makes the match non-greedy, so each match stops at the first > it reaches instead of extending to the last > in the string.
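To see the difference side by side, the sketch below runs a greedy pattern, the non-greedy pattern, and a negated character class over the same input. The negated class [^>]* is an alternative technique that is not used in the example above; it gives the same result here.

```javascript
const html = "<div>content</div><p>another content</p>";

// Greedy: ".*" runs to the last ">" in the string, giving one big match.
const greedy = html.match(/<.*>/g);
console.log(greedy); // ["<div>content</div><p>another content</p>"]

// Non-greedy: ".*?" stops at the first ">" after each "<".
const lazy = html.match(/<.*?>/g);
console.log(lazy); // ["<div>", "</div>", "<p>", "</p>"]

// Negated class: "[^>]*" can never cross a ">", so it matches the same tags.
const negated = html.match(/<[^>]*>/g);
console.log(negated); // ["<div>", "</div>", "<p>", "</p>"]
```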
While regular expressions are powerful, they can also be computationally expensive if used carelessly. Some best practices include:
Use the ^ (start of string) and $ (end of string) anchors when you know the position of the match in the string.
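As a small illustration of what anchors change, the sketch below compares an unanchored and an anchored version of the same four-digit pattern. The inputs are invented for this example.

```javascript
const unanchored = /\d{4}/;
const anchored = /^\d{4}$/;

// Unanchored: succeeds if four digits appear anywhere in the string.
const looseResult = unanchored.test("abc12345");
console.log(looseResult); // true

// Anchored: the whole string must be exactly four digits.
const strictOnNoise = anchored.test("abc12345");
const strictOnYear = anchored.test("2024");
console.log(strictOnNoise, strictOnYear); // false true
```

Besides making validation stricter, a leading ^ tells the engine where the match has to begin, so a failing input can be rejected without trying every other starting position.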