Regular Expressions Reference

Overview

A practical and portable regular expression reference for real-world use.
To stay compatible across operating systems and tools, the patterns avoid inline flag syntax such as (?s), (?m), and (?i).
Only patterns that behave the same way across common environments are included.


Newline Codes and Platform Differences

OS             Newline Code  Recommended Pattern  Key Point
Windows        \r\n          \r?\n                Universal pattern for CRLF/LF
macOS / Linux  \n            \r?\n                Compatible across all platforms

Key Notes

  • Use \r?\n to safely detect or replace newlines on Windows/macOS/Linux.
  • To match blocks including newlines, use (?:.|\r|\n) since . alone excludes line breaks.
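
As a minimal sketch of the two notes above, the snippet below uses Python's `re` module (an assumption; the reference itself is tool-agnostic) to split and normalize mixed line endings with \r?\n and to match a block that spans lines with (?:.|\r|\n). The sample strings and variable names are illustrative only.

```python
import re

# Sample text mixing Windows (CRLF) and Unix (LF) line endings (illustrative data).
text = "first\r\nsecond\nthird"

# \r?\n matches both CRLF and LF, so the split gives the same lines on every platform.
lines = re.split(r"\r?\n", text)

# Replace every newline variant with a single LF.
normalized = re.sub(r"\r?\n", "\n", text)

# . alone stops at a line break; (?:.|\r|\n) also crosses line breaks.
block = "<b>one\r\ntwo</b>"
dot_only = re.search(r"<b>.*?</b>", block)             # no match: . cannot cross \n
any_char = re.search(r"<b>(?:.|\r|\n)*?</b>", block)   # matches the whole block
```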

Part 1: Basic Syntax Overview

1-1. Character Classes and Ranges

Syntax    Meaning                         Example    Match
.         Any single char except newline  a.c / abc  abc
[abc]     One of a, b, or c               bag        b, a
[^0-9]    Any non-digit                   a1         a
[A-Z0-9]  Uppercase letter or digit       X8z        X, 8
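
The character classes above can be checked with a short sketch, assuming Python's `re` module as the host tool. `re.findall` returns every non-overlapping match, so each result lists all characters the class accepts in the sample input.

```python
import re

# Each call returns every non-overlapping match in the sample string.
dot_match = re.findall(r"a.c", "abc")      # . stands for any one character
set_match = re.findall(r"[abc]", "bag")    # b and a are in the set, g is not
negated = re.findall(r"[^0-9]", "a1")      # anything that is not a digit
ranged = re.findall(r"[A-Z0-9]", "X8z")    # uppercase letters and digits only
```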

1-2. Quantifiers

Syntax  Meaning          Example                 Match
*       Zero or more     go* / goooogle          goooo
+       One or more      o+ / google             oo
?       Zero or one      colou?r / color colour  Both
{n}     Exactly n times  a{3} / aaa              aaa
{n,}    n or more        a{2,} / aaaa            aaaa
{n,m}   Between n and m  a{2,4} / aaaaa          aaaa
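
A minimal sketch of the quantifiers above, assuming Python's `re` module. The patterns go*, o+, and colou?r are chosen here to reproduce the matches listed for *, +, and ?; the remaining patterns come straight from the table.

```python
import re

# Quantifiers are greedy: each takes as many repetitions as the input allows.
zero_or_more = re.search(r"go*", "goooogle").group(0)   # g followed by any number of o
one_or_more = re.search(r"o+", "google").group(0)       # the run of o characters
optional = re.findall(r"colou?r", "color colour")       # u may be present or absent
exact = re.search(r"a{3}", "aaa").group(0)
at_least = re.search(r"a{2,}", "aaaa").group(0)
bounded = re.search(r"a{2,4}", "aaaaa").group(0)        # stops after four a's
```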

1-3. Anchors

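Syntax    Meaning        Example     Match
^abc      Line start     abc\nzabc   abc (line 1)
abc$      Line end       zabc\nabc   abc at the end of both lines
\bword\b  Word boundary  word words  word
\Bing     Non-boundary   ringing     both ing (each follows a word character)

Note: ^ and $ match at every line boundary in most text editors; regex libraries typically anchor to the whole string unless multiline mode is enabled.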
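
The anchor behavior can be sketched as follows, assuming Python's `re` module. Passing `re.MULTILINE` as an argument (rather than an inline flag) is an assumption about the host tool; it mirrors how most editors treat ^ and $ as line anchors.

```python
import re

# With re.MULTILINE, ^ and $ anchor at every line; offsets are match start positions.
line_starts = [m.start() for m in re.finditer(r"^abc", "abc\nzabc", re.MULTILINE)]
line_ends = [m.start() for m in re.finditer(r"abc$", "zabc\nabc", re.MULTILINE)]

# Without the flag, $ anchors only at the end of the whole string.
string_end_only = len(re.findall(r"abc$", "zabc\nabc"))

# \b requires a word/non-word transition; \B requires the absence of one.
whole_word = re.findall(r"\bword\b", "word words")
inside_word = [m.start() for m in re.finditer(r"\Bing", "ringing")]
```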

1-4. Shortcuts

Syntax  Meaning                                 Example  Match
\d      Digit [0-9]                             ver2.10  2, 1, 0
\D      Non-digit                               a1       a
\w      Alphanumeric + _                        a_b-1    a, _, b, 1
\W      Non-word char (not alphanumeric or _)   a#       #
\s      Whitespace (tab/newline etc.)           a b      space
\S      Non-whitespace                          a b      a, b
\t      Tab                                     a\tb     [TAB]
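
A short sketch of the shortcuts above, assuming Python's `re` module. Each shortcut matches one character at a time; adding + (as in \d+ or \w+) groups consecutive characters into runs.

```python
import re

digits_each = re.findall(r"\d", "ver2.10")    # one match per digit
digit_runs = re.findall(r"\d+", "ver2.10")    # consecutive digits grouped together
word_runs = re.findall(r"\w+", "a_b-1")       # _ counts as a word character, - does not
non_word = re.findall(r"\W", "a#")
spaces = re.findall(r"\s", "a b")
non_spaces = re.findall(r"\S", "a b")
tab = re.search(r"\t", "a\tb").group(0)       # only the tab itself is matched
```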

1-5. Escapes and Special Characters

Syntax  Meaning        Example  Match
\.      Literal dot    a.c      .
\*      Asterisk       a*b      *
\+      Plus sign      a+b      +
\?      Question mark  what?    ?
\( \)   Parentheses    (test)   (, )
\|      Pipe           a|b      |
\\      Backslash      C:\path  \
\^      Caret          ^abc     ^
\$      Dollar         total$   $
\[ \]   Brackets       [abc]    [, ]
\{ \}   Braces         {a,b}    {, }
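
The effect of escaping can be sketched as below, assuming Python's `re` module. `re.escape` is Python's standard helper for escaping a literal string; other tools may offer a different helper or none at all. The sample inputs are illustrative.

```python
import re

# An escaped dot matches only a literal dot; an unescaped dot matches any character.
literal_dot = re.findall(r"a\.c", "a.c abc")
any_dot = re.findall(r"a.c", "a.c abc")

# An escaped pipe is a literal character rather than alternation.
pipe = re.findall(r"a\|b", "a|b ab")

# A literal backslash is written as \\ in the pattern.
backslash = re.findall(r"\\", r"C:\path")

# re.escape adds the backslashes automatically, so each sample matches itself literally.
samples = ["a*b", "a+b", "what?", "(test)", "[abc]", "{a,b}", "^abc", "total$"]
escaped_ok = all(re.fullmatch(re.escape(s), s) for s in samples)
```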

Part 2: Grouping, Alternation, Lookaround

Syntax            Purpose                      Example    Match
(abc)+            Repeat as group              abcabcx    abcabc
(?:jpg|png)       Non-capturing OR             file.png   png
foo|bar           OR condition                 bar        bar
\d+(?=円)         Numbers before “円”          合計100円  100
^(?!.*error).*    Line not containing “error”  ok\nerror  ok
(?<=¥)\d+         Number after “¥”             ¥300       300
(?<!Mr\.)\s[A-Z]  Uppercase not after “Mr.”    Ms. Alice  " A" (space + A)
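
A minimal sketch of grouping and lookaround, assuming Python's `re` module (which supports both lookahead and fixed-width lookbehind; not every tool does). The alternation pattern is written as (?:jpg|png), and `re.MULTILINE` is passed as an argument so that ^ anchors at each line; the "Mr. Smith" input is an added illustration.

```python
import re

repeated = re.search(r"(abc)+", "abcabcx").group(0)      # the group repeats as a unit
ext = re.search(r"(?:jpg|png)", "file.png").group(0)     # alternation without capturing

# Lookahead and lookbehind check context without including it in the match.
amount = re.search(r"\d+(?=円)", "合計100円").group(0)
price = re.search(r"(?<=¥)\d+", "¥300").group(0)

# Negative lookahead: keep only lines that do not contain "error".
clean_lines = re.findall(r"^(?!.*error).*", "ok\nerror", re.MULTILINE)

# Negative lookbehind: the match includes the whitespace consumed by \s.
not_mr = re.search(r"(?<!Mr\.)\s[A-Z]", "Ms. Alice").group(0)
after_mr = re.search(r"(?<!Mr\.)\s[A-Z]", "Mr. Smith")   # rejected by the lookbehind
```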

Part 3: Matching Multi-line Text

Use Case             Pattern
HTML block           <div>(?:.|\r|\n)*?</div>
Log entry            ^\[\d{4}-\d{2}-\d{2} [\d:]+\](?:.|\r|\n)*?(?=^\[\d{4}-\d{2}-\d{2}…
Markdown code block  ```(?:.|\r|\n)*?```
Comment (/*…*/)      /\*(?:.|\r|\n)*?\*/
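
Two of the multi-line patterns can be sketched as follows, assuming Python's `re` module. The patterns are written with | alternation inside (?:.|\r|\n) and with \* escaped, which is an assumption about their intended form; the sample inputs are illustrative.

```python
import re

# The lazy *? stops at the first closing tag, so only the first block is returned.
html = "<div>\nline 1\r\nline 2\n</div><div>x</div>"
first_div = re.search(r"<div>(?:.|\r|\n)*?</div>", html).group(0)

# Each /* ... */ comment is matched separately, even when it spans lines.
source = "a = 1; /* multi\nline */ b = 2; /* second */"
comments = re.findall(r"/\*(?:.|\r|\n)*?\*/", source)
```

A greedy * in place of *? would run to the last closing delimiter in the input and merge separate blocks into one match.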

Part 4: Pattern Library (Filtering / Extraction)

Purpose                Pattern                                         Example           Match
Digits only            ^\d+$                                           123               123
Alphanumeric           ^[A-Za-z0-9]+$                                  user01            user01
Email (simple)         ^[\w.-]+@[\w.-]+\.[A-Za-z]{2,}$                 a@b.com           a@b.com
URL                    https?://[\w.-]+\.[A-Za-z]{2,}(/[\w./?=&%-]*)?  https://ex.com/a  https://ex.com/a
ISO date               \d{4}-\d{2}-\d{2}                               2025-10-30        2025-10-30
Intl. phone            \+\d{1,3}[\s-]?\d{1,14}                         +81 90 1234 5678  +81 90 (stops at the second space)
Strong password        ^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$           Abcd1234          full
HTML comment           <!--(?:.|\r|\n)*?-->
Lines without “error”  ^(?!.*error).*                                  ok\nerror         ok
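
As a sketch of how the library patterns might be used for filtering and extraction, the snippet below assumes Python's `re` module. The `PATTERNS` dictionary and the `is_valid` helper are hypothetical names introduced for illustration.

```python
import re

# Hypothetical lookup table of anchored validation patterns from the library above.
PATTERNS = {
    "digits": r"^\d+$",
    "alnum": r"^[A-Za-z0-9]+$",
    "email": r"^[\w.-]+@[\w.-]+\.[A-Za-z]{2,}$",
    "password": r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$",
}

def is_valid(kind, value):
    """Return True when value matches the anchored pattern registered under kind."""
    return re.search(PATTERNS[kind], value) is not None

# Unanchored patterns extract a value from surrounding text.
url = re.search(r"https?://[\w.-]+\.[A-Za-z]{2,}(/[\w./?=&%-]*)?",
                "see https://ex.com/a now").group(0)
date = re.search(r"\d{4}-\d{2}-\d{2}", "released 2025-10-30").group(0)

# The phone pattern allows a single separator, so it stops at the second space.
phone = re.search(r"\+\d{1,3}[\s-]?\d{1,14}", "+81 90 1234 5678").group(0)
```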

Part 5: Replacement Examples

Task                      Pattern                  Replace         Input           Output
Reverse words             (\w+)\s+(\w+)            ${2} ${1}       John Doe        Doe John
Change delimiter          ;                        ,               a;b;c           a,b,c
Normalize spaces          \s{2,}                   (one space)     a  b            a b
Remove HTML tags          <[^>]+>                  (empty)         <p>a</p>        a
Trim spaces               ^\s+|\s+$                (empty)         " a "           a
Newline → space           \r?\n                    (one space)     a\nb            a b
Remove end comment        //.*$                    (empty)         x=1;//note      x=1;
Unify date format         (\d{4})/(\d{2})/(\d{2})  ${1}-${2}-${3}  2025/10/30      2025-10-30
Compress duplicate lines  ^(.*)(\r?\n\1)+$         ${1}            repeated lines  single line
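
The replacements above can be sketched with Python's `re.sub`, as an assumption about the host tool. Python writes group references as \g<1> rather than ${1}, and `re.MULTILINE` is passed as an argument so that ^ and $ anchor at each line in the duplicate-line example. The input strings are illustrative.

```python
import re

swapped = re.sub(r"(\w+)\s+(\w+)", r"\g<2> \g<1>", "John Doe")
delimited = re.sub(r";", ",", "a;b;c")
squeezed = re.sub(r"\s{2,}", " ", "a   b")           # runs of spaces become one space
no_tags = re.sub(r"<[^>]+>", "", "<p>a</p>")
trimmed = re.sub(r"^\s+|\s+$", "", "  a  ")          # strip both ends in one pass
one_line = re.sub(r"\r?\n", " ", "a\nb")
no_comment = re.sub(r"//.*$", "", "x=1;//note")
iso_date = re.sub(r"(\d{4})/(\d{2})/(\d{2})", r"\g<1>-\g<2>-\g<3>", "2025/10/30")

# \1 inside the pattern refers back to the first line, so repeated copies collapse.
deduped = re.sub(r"^(.*)(\r?\n\1)+$", r"\g<1>",
                 "same\nsame\nsame\nother", flags=re.MULTILINE)
```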

Summary

Regular expressions are a cross-platform and multilingual tool for text processing, scripting, log parsing, and data cleanup.
Using flag-independent, portable patterns helps keep results consistent across environments and editors.