JavaScript Regex (Regular Expressions)


Regular expressions (regex) are a tool for pattern matching and text manipulation in JavaScript. They let you search, match, and replace patterns within strings. Whether you're validating email addresses, extracting phone numbers, or performing simple text searches, a working knowledge of regular expressions is useful for most developers.

What is JavaScript Regex?

A regular expression (regex) is a sequence of characters that forms a search pattern. It can be used to match strings, extract data, and replace parts of strings. In JavaScript, a regex is a RegExp object. You can call its own methods, such as test() and exec(), or pass it to string methods like match(), replace(), search(), and split().

Basic Syntax of Regex

A regular expression literal is written between two forward slashes (/), like this:

const regex = /pattern/;

Alternatively, you can create a regular expression using the RegExp constructor, which is useful when the pattern is built from a string at runtime:

const regex = new RegExp('pattern');
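
One detail worth illustrating is that the constructor takes a string, so every backslash in the pattern has to be doubled. The sketch below compares the two forms and builds a pattern from a variable; the names literalDigits, constructedDigits, searchTerm, and wholeWord are made up for this example.

```javascript
// Literal and constructor forms describing the same pattern: one or more digits.
const literalDigits = /\d+/;
const constructedDigits = new RegExp("\\d+"); // backslash doubled inside a string

// Both objects hold the same pattern source.
console.log(literalDigits.source === constructedDigits.source); // true

// Building a pattern at runtime from a (hypothetical) search term.
// The second argument carries the flags; "i" means ignore case.
const searchTerm = "cat";
const wholeWord = new RegExp("\\b" + searchTerm + "\\b", "i");

console.log(wholeWord.test("The Cat sat.")); // true
console.log(wholeWord.test("concatenate"));  // false
```

If the search term can contain characters that are special in a regex (such as . or *), it would need to be escaped before being concatenated into the pattern.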

Common Regex Patterns

Here are some basic components and patterns in regular expressions:

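  • .: Matches any single character except line terminators (unless the s flag is set).
  • ^: Asserts the start of a string (or the start of a line when the m flag is set).
  • $: Asserts the end of a string (or the end of a line when the m flag is set).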
  • []: Defines a character set (e.g., [aeiou] matches any vowel).
  • |: Acts as an OR operator (e.g., a|b matches 'a' or 'b').
  • \d: Matches any digit (equivalent to [0-9]).
  • \D: Matches any non-digit.
  • \w: Matches any word character (ASCII letters, digits, and underscores).
  • \W: Matches any non-word character.
  • \s: Matches any whitespace character (spaces, tabs, newlines).
  • \S: Matches any non-whitespace character.
  • *: Matches 0 or more occurrences of the preceding element.
  • +: Matches 1 or more occurrences of the preceding element.
  • ?: Matches 0 or 1 occurrence of the preceding element.
  • {n,m}: Matches between n and m occurrences of the preceding element.
  • (): Groups expressions together and captures the text they match.
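
As a quick illustration of how several of these components combine, here is a small sketch. The sample strings and variable names are invented for the example.

```javascript
// . matches any one character between "c" and "t"; ^ and $ anchor both ends.
const cAnyT = /^c.t$/;
const samples = ["cat", "cot", "c t", "coat"];
const threeLong = samples.filter((s) => cAnyT.test(s));
console.log(threeLong); // ["cat", "cot", "c t"]

// [] character set, anchored to the start of the string.
console.log(/^[aeiou]/.test("apple"));  // true
console.log(/^[aeiou]/.test("banana")); // false

// {n,m} quantifier: between 2 and 4 digits, nothing else.
console.log(/^\d{2,4}$/.test("123"));   // true
console.log(/^\d{2,4}$/.test("12345")); // false

// ? makes the preceding "u" optional.
console.log(/^colou?r$/.test("color")); // true

// | alternation inside a capturing group ().
const petMatch = /(cat|dog)s?/.exec("two dogs");
console.log(petMatch[0]); // "dogs" (the whole match)
console.log(petMatch[1]); // "dog"  (the first capture group)
```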

Using Regex in JavaScript

JavaScript provides several methods to work with regex patterns. Let's look at some of the most commonly used methods.

1. test() Method

The test() method checks whether a string contains a match for a given regular expression. It returns true if there is a match and false otherwise.

Example: Checking for a Match

const regex = /hello/;
console.log(regex.test("hello world"));  // true
console.log(regex.test("world hello"));  // true
console.log(regex.test("hi there"));     // false
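
One behavior that often surprises people is that a regex with the g flag remembers where its last match ended (in its lastIndex property), so repeated test() calls on the same object can alternate between true and false. A minimal sketch, with illustrative variable names:

```javascript
// With the g flag, test() resumes searching from regex.lastIndex.
const statefulHello = /hello/g;

const firstCall = statefulHello.test("hello");  // matches at 0, lastIndex becomes 5
const secondCall = statefulHello.test("hello"); // starts at 5, finds nothing, lastIndex resets to 0
const thirdCall = statefulHello.test("hello");  // starts at 0 again

console.log(firstCall, secondCall, thirdCall); // true false true
```

For simple yes/no checks, leaving off the g flag avoids this state entirely.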

2. match() Method

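The match() method retrieves the matches of a regular expression in a string. With the g flag it returns an array of every match; without it, it returns an array holding the first match and its capture groups. In both cases it returns null if nothing matches.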

Example: Extracting Matches

const text = "The quick brown fox";
const regex = /\b\w{5}\b/g;  // Match words of exactly 5 word characters
console.log(text.match(regex));  // ["quick", "brown"]
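
The difference between calling match() with and without the g flag can be seen in a short sketch. The variable names here are illustrative.

```javascript
const sentence = "The quick brown fox";

// Without g: the first match, plus capture groups and the match position.
const firstFive = sentence.match(/\b(\w)(\w{4})\b/);
console.log(firstFive[0]);    // "quick" (whole match)
console.log(firstFive[1]);    // "q"     (first group)
console.log(firstFive[2]);    // "uick"  (second group)
console.log(firstFive.index); // 4

// With g: every match, but no capture groups.
const allFive = sentence.match(/\b\w{5}\b/g);
console.log(allFive); // ["quick", "brown"]

// No match at all: null, so guard before indexing into the result.
const noDigits = sentence.match(/\d/);
console.log(noDigits); // null
```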

3. replace() Method

The replace() method returns a new string in which matches of a pattern are replaced. The pattern can be a plain string or a regular expression; with a regex that lacks the g flag, only the first match is replaced.

Example: Replacing Text

const str = "Hello world!";
const regex = /world/;
const newStr = str.replace(regex, "JavaScript");
console.log(newStr);  // "Hello JavaScript!"
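
A few more replace() variations may help here: replacing every match with the g flag, reusing capture groups through $1-style references, and passing a function as the replacement. This is a sketch with made-up sample strings.

```javascript
// Without g only the first match changes; with g every match does.
const onlyFirst = "a-b-c".replace(/-/, "+");
const everyOne = "a-b-c".replace(/-/g, "+");
console.log(onlyFirst); // "a+b-c"
console.log(everyOne);  // "a+b+c"

// $1, $2, $3 refer to the capture groups, so the parts can be reordered.
const isoDates = "2024-12-21 and 2025-01-05";
const reordered = isoDates.replace(/(\d{4})-(\d{2})-(\d{2})/g, "$3/$2/$1");
console.log(reordered); // "21/12/2024 and 05/01/2025"

// A function replacement receives each match and returns its substitute.
const capitalized = "hello world".replace(/\b\w/g, (ch) => ch.toUpperCase());
console.log(capitalized); // "Hello World"
```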

4. search() Method

The search() method searches for a match and returns the index of the first match, or -1 if no match is found.

Example: Finding the Position of a Match

const text = "The rain in Spain falls mainly in the plain";
const regex = /Spain/;
console.log(text.search(regex));  // 12 (index of 'Spain')
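
For completeness, this short sketch shows the no-match case and a case-insensitive search using the i flag. The sample string and names are illustrative.

```javascript
const rhyme = "The rain in Spain";

// The i flag makes the search case-insensitive.
const spainIndex = rhyme.search(/spain/i);
console.log(spainIndex); // 12

// No match returns -1, so compare against -1 rather than relying on truthiness.
const snowIndex = rhyme.search(/snow/);
console.log(snowIndex); // -1
```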

5. split() Method

The split() method splits a string into an array of substrings based on a regex pattern.

Example: Splitting a String

const text = "apple,banana,cherry";
const regex = /,/;
console.log(text.split(regex));  // ["apple", "banana", "cherry"]
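
A regex separator becomes more useful than a plain string when the delimiters vary. The sketch below, using invented sample data, splits on commas or semicolons with optional surrounding whitespace. It also shows that a capturing group in the separator keeps the delimiters in the result.

```javascript
// Split on "," or ";" and swallow any whitespace around the delimiter.
const messyList = "apple, banana;cherry ,  date";
const fruits = messyList.split(/\s*[,;]\s*/);
console.log(fruits); // ["apple", "banana", "cherry", "date"]

// With a capturing group, the matched separators are included in the output.
const withSeparators = "a1b2c".split(/(\d)/);
console.log(withSeparators); // ["a", "1", "b", "2", "c"]
```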

Common Use Cases of Regular Expressions

1. Validating Email Addresses

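One common use of regular expressions is to validate email addresses. Here's an example of a simple regex pattern that checks whether a string looks like an email address. It is a rough format check, not a full validator: for instance, it rejects addresses containing "+" and top-level domains longer than six letters.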

const emailRegex = /^[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}$/;
console.log(emailRegex.test("example@domain.com"));  // true
console.log(emailRegex.test("example@domain"));      // false
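
To make the limits of that pattern concrete, the sketch below runs it next to a slightly relaxed variant. The relaxed pattern (relaxedEmail) is an assumption made for this example, not a standard. It allows "+" in the local part and longer top-level domains, and it is still only a format heuristic.

```javascript
// The pattern from the example above.
const strictEmail = /^[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}$/;

// Hypothetical relaxed variant: allows "+" and top-level domains of any length >= 2.
const relaxedEmail = /^[a-zA-Z0-9._+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;

const emailSamples = [
  "first.last@sub.example.org",
  "user@domain.photography",
  "user+tag@example.com",
  "no-at-sign.example.com",
];

// Each row: [address, passes strict pattern, passes relaxed pattern]
const emailResults = emailSamples.map((address) => [
  address,
  strictEmail.test(address),
  relaxedEmail.test(address),
]);

console.log(emailResults);
// first.last@sub.example.org -> true,  true
// user@domain.photography    -> false, true
// user+tag@example.com       -> false, true
// no-at-sign.example.com     -> false, false
```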

2. Validating Phone Numbers

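You can also use regex to validate phone numbers. Here's a pattern that accepts common ways of writing a 10-digit North American number, with optional parentheses around the area code and optional hyphen, dot, or space separators: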

const phoneRegex = /^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$/;
console.log(phoneRegex.test("(123) 456-7890"));  // true
console.log(phoneRegex.test("123-456-7890"));    // true
console.log(phoneRegex.test("1234567890"));      // true
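
Adding capture groups to the same pattern lets you normalize the accepted formats into one canonical form. The helper name normalizePhone and the output format are choices made for this sketch.

```javascript
// Same pattern as above, with the three digit runs wrapped in capture groups.
const phoneParts = /^\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})$/;

// Returns "123-456-7890" style output, or null if the input does not match.
function normalizePhone(input) {
  const m = phoneParts.exec(input);
  return m ? `${m[1]}-${m[2]}-${m[3]}` : null;
}

console.log(normalizePhone("(123) 456-7890")); // "123-456-7890"
console.log(normalizePhone("123.456.7890"));   // "123-456-7890"
console.log(normalizePhone("12345"));          // null

// Limitation: the parentheses are optional independently of each other,
// so an unbalanced "(" is still accepted.
console.log(normalizePhone("(123 456-7890"));  // "123-456-7890"
```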

3. Extracting Dates

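If you want to extract date strings in YYYY-MM-DD form (e.g., "2024-12-31") from text, you can use a regex pattern that matches that shape. The pattern only checks the digit layout; it does not verify that the month and day are in range.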

const dateRegex = /\d{4}-\d{2}-\d{2}/;
const text = "Today's date is 2024-12-21.";
console.log(text.match(dateRegex));  // ["2024-12-21"]
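
When the text contains several dates and you need their parts, named capture groups combined with matchAll() are one way to do it. The sample string and property names below are illustrative.

```javascript
const changelog = "Created 2024-12-21, updated 2025-01-05.";

// Named groups label each part; matchAll() requires the g flag.
const isoDatePattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/g;

// Convert every match into a plain object with numeric fields.
const parsedDates = Array.from(changelog.matchAll(isoDatePattern), (m) => ({
  text: m[0],
  year: Number(m.groups.year),
  month: Number(m.groups.month),
  day: Number(m.groups.day),
}));

console.log(parsedDates);
// [
//   { text: "2024-12-21", year: 2024, month: 12, day: 21 },
//   { text: "2025-01-05", year: 2025, month: 1, day: 5 }
// ]
```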

4. Removing Extra Whitespace

You can use regex to clean up extra spaces or unwanted characters from a string.

const text = "  Hello   world!  ";
const cleanText = text.replace(/\s+/g, " ").trim();
console.log(cleanText);  // "Hello world!"

5. Extracting URLs

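Here's how you can extract URLs from a string using regex. Note that this pattern captures only the scheme and host name, so any path or query string after the domain is not included: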

const text = "Visit https://example.com or http://another-example.com for more info.";
const urlRegex = /https?:\/\/[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}/g;
console.log(text.match(urlRegex));  // ["https://example.com", "http://another-example.com"]
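
If paths and query strings matter, one possible approach is to let the regex find rough candidates and hand each one to the built-in URL constructor for parsing. The candidate pattern and the trailing-punctuation cleanup below are assumptions made for this sketch and will not suit every kind of text.

```javascript
const note = "See https://example.com/docs?page=2, or http://another-example.com.";

// Rough candidate: scheme followed by any run of non-whitespace characters.
const urlCandidate = /https?:\/\/\S+/g;

// Sentence punctuation that tends to stick to the end of a candidate.
const trailingPunctuation = /[.,;:!?)]+$/;

const parsedUrls = (note.match(urlCandidate) || [])
  .map((raw) => raw.replace(trailingPunctuation, ""))
  .map((cleaned) => new URL(cleaned)); // throws on input it cannot parse

const hostsAndPaths = parsedUrls.map((u) => u.hostname + u.pathname);
console.log(hostsAndPaths); // ["example.com/docs", "another-example.com/"]
```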

Advanced Techniques

1. Lookahead and Lookbehind Assertions

Lookahead and lookbehind assertions allow you to match a pattern only if it is followed or preceded by another pattern, without including it in the match.

Example: Positive Lookahead

const regex = /\d+(?=\s+USD)/;
const text = "Price: 100 USD";
console.log(text.match(regex));  // ["100"]

Here:

  • The pattern \d+(?=\s+USD) matches one or more digits only when they are followed by whitespace and "USD"; neither the whitespace nor "USD" is included in the match.

Example: Positive Lookbehind

const regex = /(?<=@)[a-zA-Z0-9.-]+/;
const email = "user@example.com";
console.log(email.match(regex));  // ["example.com"]

Here:

  • The pattern (?<=@) asserts that the match is preceded by "@" but does not include it in the result.
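
Both assertions also have negative forms, (?!...) and (?<!...), which succeed only when the given pattern is absent. A small sketch with invented sample strings follows. The \b after \d+ is there so the engine cannot satisfy the negative lookahead by backtracking to a shorter number.

```javascript
// Negative lookahead: amounts that are NOT followed by " USD".
const priceList = "100 USD, 200 EUR, 300 USD";
const notUsd = priceList.match(/\b\d+\b(?!\s+USD)/g);
console.log(notUsd); // ["200"]

// Negative lookbehind: numbers that are NOT preceded by "$".
const receipt = "$5 and 7 items";
const notDollar = receipt.match(/(?<!\$)\b\d+\b/g);
console.log(notDollar); // ["7"]
```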

2. Non-Greedy Matching

By default, regex tries to match as much text as possible. However, sometimes you need it to match as little as possible. This is called non-greedy matching, and it is done using ? after quantifiers like *, +, or {n,m}.

Example: Non-Greedy Match

const regex = /<.*?>/g;
const text = "<div>content</div><p>another content</p>";
console.log(text.match(regex));  // ["<div>", "</div>", "<p>", "</p>"]

Here, .*? makes the match non-greedy, so each match ends at the first ">" it reaches instead of extending to the last ">" in the string.
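
To compare the behaviors side by side, the sketch below runs a greedy pattern, the non-greedy pattern, and a negated character class against the same string. The variable names are illustrative.

```javascript
const markup = "<div>content</div><p>another content</p>";

// Greedy: .* runs to the last ">" in the string, giving one big match.
const greedyTags = markup.match(/<.*>/g);
console.log(greedyTags); // ["<div>content</div><p>another content</p>"]

// Non-greedy: .*? stops at the first ">" after each "<".
const lazyTags = markup.match(/<.*?>/g);
console.log(lazyTags); // ["<div>", "</div>", "<p>", "</p>"]

// Alternative: a negated character class gives the same result here
// without relying on lazy quantifiers.
const negatedTags = markup.match(/<[^>]*>/g);
console.log(negatedTags); // ["<div>", "</div>", "<p>", "</p>"]
```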

Performance Considerations

While regular expressions are powerful, they can also be computationally expensive if used carelessly. Some best practices include:

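  • Avoid overly broad patterns, such as nested quantifiers like (a+)+ or several .* in a row, which can force the engine into excessive backtracking.
  • Test and optimize your regex for performance, especially if you're processing large datasets.
  • Use ^ (start of string) and $ (end of string) anchors when the match must begin or end at a known position, so the engine does not have to try every position in the string.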
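
As a small illustration of the first and last points, the sketch below compares a nested quantifier with a flat equivalent and an anchored pattern with an unanchored one. The inputs are kept deliberately short, and all names are made up for the example. On long non-matching input, a backtracking engine may take dramatically longer with the nested form.

```javascript
// Nested quantifier: a run of "a"s can be split between the inner and outer +
// in many ways, and a backtracking engine may try them all before failing.
const nestedQuantifier = /^(a+)+$/;

// Flat equivalent: accepts exactly the same strings with only one way to match.
const flatQuantifier = /^a+$/;

const shortInputs = ["a", "aaaa", "aaab", ""];
const sameVerdicts = shortInputs.every(
  (s) => nestedQuantifier.test(s) === flatQuantifier.test(s)
);
console.log(sameVerdicts); // true

// Anchors pin the match to the whole string instead of any substring.
const anchoredDigits = /^\d+$/;
const unanchoredDigits = /\d+/;
console.log(anchoredDigits.test("45a"));   // false
console.log(unanchoredDigits.test("45a")); // true
```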