Strings in Python


A string is a sequence of characters.

Creation:

To create strings in Python, we put the text inside single quotes or double quotes like this:

greeting = 'My name is Andy'
print(type(greeting)) # Output: <class 'str'>

greeting = "My name is Andy"
print(type(greeting)) # Output: <class 'str'>

'Hi' and "Hello" are called string literals. A string literal, or string, is a series of zero or more characters enclosed in single or double quotes.

Sometimes you have to choose between single and double quotes to avoid problems. For example, suppose we want to change the greeting to My name's Andy, so we add an apostrophe, like this:

greeting = 'My name's Andy' # SyntaxError

As you can see, we have a problem. Python sees the first apostrophe and recognizes that the string starts there. But then it finds the next apostrophe and assumes the string ends there, so the rest of the line is no longer valid code.

We can solve this by using double quotes to wrap up the text:

greeting = "My name's Andy"

It's similar if we want to have double quotes as part of the string's content:

greeting = "Hey "bro"" # doesn't work
greeting = 'Hey "bro"' # works

Escape character:

There is another way to solve these problems, and that is by using the escape character (\):

greeting = 'My name\'s Andy'
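
The backslash also escapes double quotes and introduces a few other common sequences. The following is a minimal sketch; the variable names are illustrative and not part of the lesson:

```python
# Escaping a quote that matches the enclosing quotes
single = 'My name\'s Andy'
double = "Hey \"bro\""

# \n is a newline and \\ is a single literal backslash
two_lines = "first\nsecond"
path = "C:\\temp"

print(single)     # My name's Andy
print(double)     # Hey "bro"
print(two_lines)  # prints "first" and "second" on separate lines
print(len(path))  # 7, because \\ counts as one character
```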

Getting the length:

We can get the length of a string using the len() function:

str = "Hello"
print(len(str)) # prints 5

Accessing characters:

To access the characters in a string, you use the list-like [] notation with the zero-based index:

str = "Hello"
print(str[0]) # Output: H
print(str[1]) # Output: e

Indices must be integers strictly less than len(str); otherwise, you'll get IndexError: string index out of range.

It also works with negative indices like it does for lists:

str = "Hello"
print(str[-1]) # Output: o
print(str[-2]) # Output: l

Negative indices must be integers greater than or equal to -len(str); otherwise, you'll get IndexError: string index out of range.
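
To make the valid range concrete, here is a small sketch that reads the first and last characters through the boundary indices and then catches the error raised by an out-of-range index. The names word and message are illustrative only:

```python
word = "Hello"

# Valid indices run from -len(word) to len(word) - 1
print(word[len(word) - 1])  # o
print(word[-len(word)])     # H

# Anything outside that range raises IndexError
try:
    word[5]
except IndexError as error:
    message = str(error)
    print(message)  # string index out of range
```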


Concatenating strings via + operator:

To concatenate two or more strings, you use the + operator:

name = 'John'
str = 'Hello ' + name

print(str) # prints "Hello John"

If you want to assemble a string piece by piece, you can use the += operator:

greet = 'Welcome'
name = 'John'
greet += ' to AlgoCademy, '
greet += name

print(greet) # prints "Welcome to AlgoCademy, John"

Formatted strings:

Many times we will dynamically generate some text using variables. For example:

name = "Andy"
pet = "cat"

message = "Hey, " + name + "! Nice " + pet + "!"

print(message) # Output: Hey, Andy! Nice cat!

While this approach works perfectly well, it's not ideal: as our text gets more complicated, it becomes harder for the reader to visualize all the concatenations in their head. The same result is much easier to achieve with formatted strings.

To define formatted strings, prefix your strings with an f and then use curly braces {} to dynamically insert values into your strings:

name = "Andy"
pet = "cat"

message = f"Hey, {name}! Nice {pet}!"

print(message) # Output: Hey, Andy! Nice cat!

With these curly braces, we're defining placeholders, or holes, in our string. When we run our program, these holes are filled with the values of our variables.

So here we have two placeholders in our string: one for the value of the name variable and the other for the value of the pet variable.
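
A placeholder is not limited to a plain variable name. It can hold any expression, and it can carry a format specifier after a colon. A short sketch, with pets and price as made-up values for illustration:

```python
name = "Andy"
pets = 3
price = 4.5

# Any expression can go inside the braces
summary = f"{name.upper()} has {pets + 1} pets"
print(summary)  # ANDY has 4 pets

# A format specifier after the colon controls how the value is shown
total = f"Total: {price:.2f}"
print(total)  # Total: 4.50
```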

String slicing

We often want to get a substring of a string. For this, we use slicing just like we did with lists (str[startIndex:endIndex]):

str = 'JavaScript'
substr = str[2:6]

print(substr) # Output: "vaSc"

The startIndex is the zero-based index at which the slice starts extraction.

The endIndex is also a zero-based index, before which the slice ends the extraction. The substr will not include the character at the endIndex index.

If you omit the endIndex, the slice assumes endIndex = len(str) by default and extracts to the end of the string:

str = 'JavaScript'
substr = str[4:] # equivalent to: substr = str[4:10]

print(substr) # Output: "Script"

If you omit the startIndex, the slice assumes startIndex = 0 by default and starts extraction from the first character:

str = 'JavaScript'
substr = str[:5] # equivalent to: substr = str[0:5]

print(substr) # Output: "JavaS"

If endIndex is beyond the end of the string, Python does not raise an error; it stops at the end instead:

str = 'JavaScript'
substr = str[2:50]

print(substr) # Output: "vaScript"

Try not to rely on this; our job as programmers is to make sure our indices are correct.

String methods:

upper(), lower(), find(), replace(), in operator, title()
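
The following sketch shows each of these in action on a sample string. The variable names are illustrative; note that every method returns a new string and leaves the original untouched:

```python
text = "hello world"

print(text.upper())     # HELLO WORLD
print(text.title())     # Hello World
print("HeLLo".lower())  # hello

# find() returns the index of the first match, or -1 if there is none
print(text.find("world"))  # 6
print(text.find("xyz"))    # -1

# replace() returns a new string; the original is unchanged
replaced = text.replace("world", "there")
print(replaced)  # hello there
print(text)      # hello world

# The in operator checks whether a substring occurs
print("wor" in text)  # True
```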

String Immutability:

In Python, string values are immutable, which means that they cannot be altered once created.

For example, the following code:

myStr = "Bob"
myStr[0] = 'J'

raises TypeError: 'str' object does not support item assignment. It cannot change the value of myStr to "Job", because the contents of myStr cannot be altered.

Note that this does not mean that myStr cannot be changed, just that the individual characters of a string cannot be changed in place. The only way to change myStr is to assign it a new string, like this:

myStr = "Bob"
myStr = "Job"
# or
myStr = 'J' + myStr[1:3]
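
Putting both halves together, the sketch below catches the error from the in-place assignment and then builds a new string instead. The name my_str is used here only for illustration:

```python
my_str = "Bob"

try:
    my_str[0] = "J"  # item assignment is not supported on str
except TypeError as error:
    print(error)  # 'str' object does not support item assignment

# Building a new string and rebinding the name works
my_str = "J" + my_str[1:]
print(my_str)  # Job
```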

The split function

The split() method splits (divides) a string into substrings at each occurrence of a splitter (or divider). The splitter can be a single character or a longer string. Unlike in some other languages, it cannot be a regular expression; for pattern-based splitting, Python provides re.split() in the standard library.

After splitting the string into substrings, the split() method puts them in a list and returns it. It doesn't make any modifications to the original string.

This splitter is provided as an argument in the split() function. For example:

str = "I am happy!"
myArr = str.split(" ")

print(myArr) # prints ['I', 'am', 'happy!']
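
A few more variations, as a hedged sketch with made-up sample strings: a multi-character splitter, a call with no argument at all, and pattern-based splitting through the standard-library re.split():

```python
import re

csv_line = "red, green, blue"

# A splitter can be longer than one character
colors = csv_line.split(", ")
print(colors)  # ['red', 'green', 'blue']

# With no argument, split() breaks on runs of whitespace
words = "  I   am happy!  ".split()
print(words)  # ['I', 'am', 'happy!']

# For pattern-based splitting, use re.split() instead of str.split()
parts = re.split(r"[,;]\s*", "a, b;c")
print(parts)  # ['a', 'b', 'c']
```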

Convert a string to a list of characters:

When we have to change characters in a string often, we want a fast way of doing it. Because strings are immutable, changing a character means creating a whole new string, which is O(n).

But lists are mutable, and changing an element in a list is O(1). So we can first convert our string to a list of characters, operate on that list, and then convert the list back to a string.

We can convert a string to a list using the list() function, like this:

str = "Andy"
myArr = list(str)

print(myArr) # prints ['A', 'n', 'd', 'y']
myArr[0] = 'a'
myArr[2] = 'D'

# We convert the list of chars back to a string with join():
str = ''.join(myArr)

print(str) # prints "anDy"
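
The same pattern can be wrapped in a helper so that both conversions happen only once, however many characters change. The function swap_case_at below is a hypothetical name invented for this sketch, not something from the lesson or the standard library:

```python
def swap_case_at(text, positions):
    """Return a copy of text with the case flipped at each given index."""
    chars = list(text)  # O(n) conversion, done once
    for i in positions:
        chars[i] = chars[i].swapcase()  # O(1) per change
    return "".join(chars)  # O(n) conversion back, done once

print(swap_case_at("Andy", [0, 2]))  # anDy
```

With k changes on a string of length n, this costs O(n + k) rather than the O(n * k) of rebuilding the string on every change.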

Iterating through the characters:

We can iterate through the characters of a string using a for loop and the in keyword, like this:

myStr = "Andy"
for c in myStr:
    print(c)

# This will print the characters 'A', 'n', 'd' and 'y' on different lines

We can also iterate through the characters of a string using indices and a for loop, like this:

myStr = "Andy"
for i in range(len(myStr)):
    print(myStr[i])

# This will print the characters 'A', 'n', 'd' and 'y' on different lines

Both ways take O(n) time, where n is the length of the string we iterate over.
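
When both the index and the character are needed, the built-in enumerate() gives them together. This is a small sketch of an alternative to the two loops above, and it is also O(n):

```python
my_str = "Andy"

# enumerate() yields (index, character) pairs
for i, c in enumerate(my_str):
    print(i, c)

# This will print "0 A", "1 n", "2 d" and "3 y" on different lines
```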


Assignment
Follow the Coding Tutorial and let's play with some strings.


Hint
Look at the examples above if you get stuck.


Introduction

Strings are a fundamental data type in Python, representing sequences of characters. They are essential for handling text data, which is a common requirement in many programming tasks. Understanding how to manipulate strings is crucial for tasks such as data processing, user input handling, and generating dynamic content.

Understanding the Basics

Strings in Python can be created using single or double quotes. This flexibility allows you to include quotes within your strings without causing syntax errors. For example:

greeting = 'Hello, world!'
quote = "Albert Einstein once said, 'Imagination is more important than knowledge.'"

It's important to understand how to handle special characters within strings, which can be done using escape characters (e.g., \' for a single quote).

Main Concepts

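Key concepts in string manipulation include creating strings, getting their length, accessing characters by index, concatenation, formatted strings, slicing, string methods, immutability, splitting, and iteration.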

Examples and Use Cases

Here are some examples demonstrating string manipulation:

# Example 1: Concatenation
first_name = "John"
last_name = "Doe"
full_name = first_name + " " + last_name
print(full_name)  # Output: John Doe

# Example 2: Formatted Strings
age = 30
message = f"Hello, {first_name}! You are {age} years old."
print(message)  # Output: Hello, John! You are 30 years old.

# Example 3: Slicing
text = "Hello, world!"
substring = text[7:12]
print(substring)  # Output: world

Common Pitfalls and Best Practices

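Common mistakes include using the same quote character inside a string as the one that encloses it without escaping it, indexing past the end of a string, and trying to change a character of a string in place.

Best practices include using formatted strings instead of long chains of concatenation, keeping slice indices within range, and converting a string to a list of characters when many characters need to change.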

Advanced Techniques

Advanced string manipulation techniques include pattern matching with regular expressions, available through the re module:

import re

# Example: Using regular expressions
pattern = r'\b\w{5}\b'
text = "Hello world! This is a test."
matches = re.findall(pattern, text)
print(matches)  # Output: ['Hello', 'world']

Code Implementation

Here is a comprehensive example demonstrating various string operations:

# String creation
greeting = "Hello, world!"

# String length
length = len(greeting)
print(f"Length: {length}")  # Output: Length: 13

# Accessing characters
first_char = greeting[0]
last_char = greeting[-1]
print(f"First character: {first_char}, Last character: {last_char}")  # Output: First character: H, Last character: !

# Concatenation
name = "Alice"
welcome_message = greeting + " Welcome, " + name + "!"
print(welcome_message)  # Output: Hello, world! Welcome, Alice!

# Formatted strings
age = 25
formatted_message = f"{name} is {age} years old."
print(formatted_message)  # Output: Alice is 25 years old.

# Slicing
substring = greeting[7:12]
print(f"Substring: {substring}")  # Output: Substring: world

# String methods
upper_case = greeting.upper()
print(f"Upper case: {upper_case}")  # Output: Upper case: HELLO, WORLD!

# Split and join
words = greeting.split()
joined = ' '.join(words)
print(f"Words: {words}, Joined: {joined}")  # Output: Words: ['Hello,', 'world!'], Joined: Hello, world!

Debugging and Testing

When debugging string-related code, one useful technique is to check a function's output against the expected string with assertions:

# Example: Debugging with assertions
def greet(name):
    return f"Hello, {name}!"

assert greet("Alice") == "Hello, Alice!"
assert greet("Bob") == "Hello, Bob!"