C++ String Slicing with substr(): The LENGTH Gotcha You Need to Know

C++'s std::string::substr() is the standard way to extract substrings, and it shows up constantly in text parsing, log processing, protocol implementations, and algorithm problems. It differs from Python slicing and Java's substring() in one critical way: the second parameter is a length, not an end index. Getting this wrong rarely throws, because an oversized length is quietly clipped to the end of the string, so the result is a wrong substring rather than an error.

The Syntax: Position and Length

// substr(startPos, length) — length optional, defaults to end of string
std::string s = "JavaScript";
//               J  a  v  a  S  c  r  i  p  t
//               0  1  2  3  4  5  6  7  8  9

std::cout << s.substr(2, 4) << "\n";   // "vaSc"   — start at 2, take 4 chars
std::cout << s.substr(4)    << "\n";   // "Script" — from 4 to end
std::cout << s.substr(0, 4) << "\n";   // "Java"   — first 4 characters

Critical Difference: Length vs End Index

This is the #1 confusion with substr(). The second parameter is how many characters to take — not where to stop:

std::string s = "JavaScript";

// To extract "Script" (indices 4 through 9):
// Python:  s[4:10]             — second param is end index
// Java:    s.substring(4, 10) — second param is end index
// C++:     s.substr(4, 6)     — second param is LENGTH (10 - 4 = 6)

std::cout << s.substr(4, 6) << "\n";   // "Script" ✓

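// Common mistake: passing the end index as in Python/Java.
// s.substr(4, 10) asks for 10 chars from position 4. Only 6 remain, so the
// length is clipped and the call still returns "Script" by accident.
// The bug surfaces with other ranges: s.substr(2, 6) gives "vaScri", not "vaSc".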

Formula: To mimic s[start:end], write s.substr(start, end - start).
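
The formula can be wrapped in a small helper so that call sites read like end-index slicing. The following is a minimal sketch, not a standard facility: the name sliceRange is hypothetical, and it assumes non-negative indices with start <= end.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the standard library): returns the
// characters at indices start .. end-1, like Python's s[start:end]
// for non-negative indices.
std::string sliceRange(const std::string& s, std::size_t start, std::size_t end) {
    if (start >= end) return "";   // empty or inverted range
    // substr() takes a length, so convert the end index with end - start.
    // For a non-empty range, start > s.size() still throws
    // std::out_of_range, exactly as substr() does.
    return s.substr(start, end - start);
}
```

With s = "JavaScript", sliceRange(s, 2, 6) returns "vaSc", whereas s.substr(2, 6) returns "vaScri".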

Getting the Last N Characters

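C++ has no negative indices. To extract from the end, subtract from size(). Because size() is unsigned, make sure n does not exceed the string's length, or s.size() - n wraps around to a huge value: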

std::string s = "JavaScript";

// Last 6 characters (Python: s[-6:])
std::cout << s.substr(s.size() - 6) << "\n";       // "Script"

// All but the last 4 characters (Python: s[:-4])
std::cout << s.substr(0, s.size() - 4) << "\n";    // "JavaSc"

// Last character:
std::cout << s.back() << "\n";                      // 't'
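
The expressions above are only safe when the string has at least n characters. If n is larger, s.size() - n wraps around: substr(s.size() - n) then throws std::out_of_range, and substr(0, s.size() - n) returns the whole string. A minimal sketch of guarded versions follows; the names lastN and dropLastN are hypothetical, and the fallback behavior (mirroring Python's s[-n:] and s[:-n]) is an assumption about what the caller wants.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers; the names are illustrative, not standard.

// Last n characters, or the whole string if it has fewer than n.
std::string lastN(const std::string& s, std::size_t n) {
    if (n >= s.size()) return s;        // guard against unsigned wrap-around
    return s.substr(s.size() - n);
}

// Everything except the last n characters, or "" if it has fewer than n.
std::string dropLastN(const std::string& s, std::size_t n) {
    if (n >= s.size()) return "";       // nothing left to keep
    return s.substr(0, s.size() - n);
}
```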

Exception: std::out_of_range

If startPos is greater than size(), substr() throws std::out_of_range. A startPos equal to size() is valid and returns an empty string. A length that exceeds the available characters is also safe: it is clipped to the end of the string.

std::string s = "Hi";

try {
    std::cout << s.substr(0, 100) << "\n";   // "Hi"  — length clipped, no error
    std::cout << s.substr(5)      << "\n";   // throws! startPos > size()
} catch (const std::out_of_range& e) {
    std::cout << "Error: " << e.what() << "\n";
}
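
When an out-of-range start position should simply produce an empty result, the check can be done up front instead of catching the exception. A minimal sketch, assuming a hypothetical wrapper named substrOrEmpty; whether returning "" is the right fallback depends on the caller.

```cpp
#include <cstddef>
#include <string>

// Hypothetical non-throwing wrapper: returns "" when pos is past the end.
std::string substrOrEmpty(const std::string& s, std::size_t pos,
                          std::size_t len = std::string::npos) {
    if (pos > s.size()) return "";   // substr() would throw std::out_of_range here
    return s.substr(pos, len);       // pos == s.size() is valid and yields ""
}
```

For s = "Hi", substrOrEmpty(s, 5) and substrOrEmpty(s, 2) both return an empty string, and substrOrEmpty(s, 0, 100) returns "Hi".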

Dynamic Slicing with find()

Combine find() with substr() for runtime boundaries. Always check for std::string::npos — returned when find() finds nothing:

// Extract domain from email:
std::string email = "user@example.com";
size_t atPos = email.find('@');
if (atPos != std::string::npos) {
    std::string domain = email.substr(atPos + 1);
    std::cout << domain << "\n";   // "example.com"
}

// Extract file extension (search from the right with rfind):
std::string filename = "report_2024.pdf";
size_t dotPos = filename.rfind('.');
if (dotPos != std::string::npos) {
    std::string ext = filename.substr(dotPos + 1);
    std::cout << ext << "\n";       // "pdf"
}
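
Both examples above slice to the end of the string. When the substring sits between two delimiters, the length has to be computed from the two positions that find() returns. A minimal sketch, assuming a hypothetical helper named between that returns an empty string when either delimiter is missing:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: text between the first `open` and the next `close`.
// Returns "" if either delimiter is missing.
std::string between(const std::string& s, char open, char close) {
    const std::size_t first = s.find(open);
    if (first == std::string::npos) return "";
    const std::size_t start = first + 1;            // first char after `open`
    const std::size_t end = s.find(close, start);   // search only after `open`
    if (end == std::string::npos) return "";
    return s.substr(start, end - start);            // length, not end index
}
```

For example, between("[INFO] server started", '[', ']') returns "INFO", and between("user@example.com", '@', '.') returns "example".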

Python-Style Slice Equivalent in C++

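// Helper: Python-style slice with negative index support
// (needs <algorithm> for std::max/std::min; assumes s.size() fits in an int)
std::string pySlice(const std::string& s, int start, int end) {
    const int n = static_cast<int>(s.size());
    if (start < 0) start = std::max(0, n + start);   // negative: count from the end
    if (end   < 0) end   = std::max(0, n + end);
    end = std::min(end, n);                          // clamp end to the string length
    if (start >= end) return "";                     // empty or inverted range
    return s.substr(start, end - start);             // convert end index to length
}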

std::string s = "JavaScript";
std::cout << pySlice(s, 2, 6)   << "\n";  // "vaSc"
std::cout << pySlice(s, -6, -2) << "\n";  // "Scri" — negative indices work

Key Takeaways

  • substr(pos, len): second parameter is a length, not an end index — this is the #1 gotcha.
  • To mimic s[start:end], write s.substr(start, end - start).
  • No negative indices — use s.size() - n to count from the end.
  • A startPos greater than size() throws std::out_of_range; startPos == size() returns an empty string; excess length clips silently.
  • Always check find() result against std::string::npos before passing to substr().

What to Learn Next

With substr() mastered, explore the rest of the C++ string API: find(), rfind(), and find_first_of() for locating positions, replace() for in-place substitution, and std::string_view (C++17) for zero-copy substring references, which matter in performance-critical code. substr() also appears throughout algorithm problems such as sliding window, palindrome checking, and string hashing; keep in mind that each call copies the characters it returns.
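
As a first taste of std::string_view, its substr() takes the same (position, length) arguments as std::string::substr() but returns a view instead of a copy. A minimal sketch, assuming a C++17 compiler and a hypothetical helper named extensionView:

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: file extension as a view into `filename`.
// No characters are copied. The returned view is only valid while the
// underlying character data is alive and unmodified.
std::string_view extensionView(std::string_view filename) {
    const std::size_t dot = filename.rfind('.');
    if (dot == std::string_view::npos) return {};   // no extension: empty view
    return filename.substr(dot + 1);                // same (pos, len) semantics
}
```

Here extensionView("report_2024.pdf") yields "pdf". Avoid passing a temporary std::string and keeping the result, since the view would then refer to destroyed storage.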