Strings in Python
Getting Started
When it comes to working with text in Python, strings are our bread and butter. We'll cover almost everything you need to know about strings, from creating them to indexing, slicing, and using the built-in string methods.
Definition-wise, strings represent an immutable sequence of characters, and we use them to store text-based information. In a similar fashion to other programming languages, Python uses Unicode to represent characters.
print("Welcome to Web Reference")
# Output: Welcome to Web Reference
However, Python doesn't have a separate data type for a single character. Instead, a single character is essentially a string with a length of 1. To showcase the claim, we can access individual characters of a string using square brackets, just like we would with an array:
my_string = "Hello, World!"
print(my_string[0]) # Output: H
Note that the indexing of a string begins at 0, which means the first character is at index 0, the second character is at index 1, and so on.
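To make the "a character is just a string of length 1" point concrete, here is a minimal sketch. The variable names first_char and last_char are illustrative, and the negative index is the same idea the slicing section uses later.

```python
my_string = "Hello, World!"

first_char = my_string[0]    # index 0 is the first character
last_char = my_string[-1]    # a negative index counts from the end

# Indexing returns an ordinary str object of length 1, not a separate char type.
print(first_char, type(first_char), len(first_char))  # H <class 'str'> 1
print(last_char)                                       # !
```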
Since strings are iterable, we can also loop over them using a for loop.
my_string = "Python is awesome!"
for char in my_string:
    print(char)
Learn about the basics of loops in Python here.
Creating Strings
To create a string in Python, we simply surround the text with quotes. We can use either single quotes or double quotes, depending on our personal preference. Either one is fine as long as we remain consistent throughout the code.
single_quotes = 'This is a string with single quotes.'
double_quotes = "This is a string with double quotes."
Additionally, we enclose strings in triple quotes when we need to create text that spans multiple lines, like so:
multi_line_string = '''This is a string
that spans
multiple lines.'''
print(multi_line_string)
# Output:
This is a string
that spans
multiple lines.
We cover some additional context on triple quotes in our Comments in Python article.
Sometimes, we may need to include special characters in our string. To include them, we use escape sequences, which start with a backslash (\) followed by a special character.
Let's illustrate a few use cases:
# Using the escape sequence for a double quote
my_string = "She said, \"Hello!\""
print(my_string) # Output: She said, "Hello!"
# Using the escape sequence for a newline and a tab
my_string = 'First line.\nSecond line.\n\tIndented line.'
print(my_string)
# Output:
First line.
Second line.
    Indented line.
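Two more escape sequences tend to come up often: an escaped single quote inside a single-quoted string, and an escaped backslash. The sketch below assumes made-up sample values (the path is hypothetical and only there for illustration).

```python
# A single quote inside a single-quoted string has to be escaped.
quote = 'It\'s a sunny day.'

# A literal backslash is written as two backslashes.
path = "C:\\temp\\notes.txt"   # hypothetical path, for illustration only

print(quote)  # It's a sunny day.
print(path)   # C:\temp\notes.txt

# Each escape sequence counts as a single character.
print(len("\n"))  # 1
```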
The len() Function and String Checking
The len() function returns the number of characters in a string. Logically, this can be useful when we need to determine the length of a string before performing some other operation on it.
my_string = "Python is awesome!"
print(len(my_string)) # Output: 18
Moreover, we can check whether a specific substring occurs in a larger string using the in and not in operators.
sentence = "The quick brown fox jumps over the lazy dog"
print("fox" in sentence) # Output: True
if "cat" not in sentence:
    print("No cats present here")
# Output: No cats present here
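One detail worth keeping in mind is that these membership checks are case-sensitive. A small sketch, assuming we want a case-insensitive check and are happy to lowercase both sides first (the variable names are illustrative):

```python
sentence = "The quick brown fox jumps over the lazy dog"

# The check compares characters exactly, so the case has to match.
exact_match = "FOX" in sentence
print(exact_match)  # False

# Lowercasing both strings first gives a case-insensitive check.
relaxed_match = "FOX".lower() in sentence.lower()
print(relaxed_match)  # True
```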
Modifying Strings
Even though strings are immutable, we can create modified copies of the original ones. For instance, we can convert a string to either uppercase or lowercase with the upper() and lower() methods, respectively.
my_string = "Python is awesome!"
print(my_string.upper()) # Output: PYTHON IS AWESOME!
print(my_string.lower()) # Output: python is awesome!
Also, we can replace a part of the original string and insert a new substring with the replace() method.
my_string = 'Python is awesome!'
new_string = my_string.replace('awesome', 'incredible')
print(new_string) # Output: Python is incredible!
Or, we can split it into a list of substrings with the split() method.
my_string = 'This is a long string with several words'
words = my_string.split(' ')
print(words) # Output: ['This', 'is', 'a', 'long', 'string', 'with', 'several', 'words']
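To see what immutability means in practice, the sketch below tries to change a character in place and then confirms that upper() leaves the original untouched. The name shouted is just an illustrative choice.

```python
my_string = "Python is awesome!"

try:
    my_string[0] = "J"   # strings do not support item assignment
except TypeError as error:
    print("Cannot modify in place:", error)

# Methods such as upper() return a new string instead.
shouted = my_string.upper()
print(my_string)  # Python is awesome!
print(shouted)    # PYTHON IS AWESOME!
```

If we want to keep the modified version, we assign the result to a variable (possibly the same one).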
String Concatenation
By concatenation, we refer to the process of combining two or more strings together. In Python, we can use the + operator to glue strings together, as seen in the example below.
greeting = "Hello"
name = "Alice"
message = greeting + ", " + name + "!"
print(message) # Output: Hello, Alice!
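The + operator only joins strings with other strings. As a rough sketch of what happens with mixed types, the example below (with made-up values) first triggers the error and then converts the number with str():

```python
name = "Alice"
age = 30

try:
    broken = name + " is " + age   # str + int raises a TypeError
except TypeError as error:
    print("Concatenation failed:", error)

# Converting the integer to a string first makes the concatenation work.
message = name + " is " + str(age) + " years old."
print(message)  # Alice is 30 years old.
```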
String Slicing
We can extract a portion of a string with the square bracket notation. Inside it, we need to specify a starting and ending index. The key thing to remember is that the character at the start index position is included in the slice, which isn't the case with the end index.
my_string = "This is a long string with several words"
substring = my_string[8:18]
print(substring) # Output: a long str
We can also use negative indexing to slice a string from the end.
my_string = "This is a long string with several words"
substring = my_string[-5:-1]
print(substring) # Output: word
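Either index can also be left out, in which case the slice runs from the very start or to the very end of the string. The sketch below shows both, plus an optional third value (the step), which with -1 walks the string backwards. The variable names are illustrative.

```python
my_string = "This is a long string with several words"

first_word = my_string[:4]    # start omitted: begins at index 0
last_word = my_string[-5:]    # end omitted: runs to the end of the string
reversed_word = last_word[::-1]  # step of -1 reads the characters in reverse

print(first_word)     # This
print(last_word)      # words
print(reversed_word)  # sdrow
```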
String Formatting
Depending on our needs, there are several ways to approach string formatting, especially when dealing with mixed data types.
With the % operator, we can use placeholders that are replaced by values at runtime.
name = "Bob"
age = 30
message = "My name is %s and I am %d years old." % (name, age)
print(message) # Output: My name is Bob and I am 30 years old.
The %s and %d format specifiers indicate where string and integer values should be inserted, respectively. In this specific case, name is a string variable, so it fills the %s placeholder, while age is an integer variable, so it fills the %d placeholder. When the % operator is applied to the string with the tuple (name, age) as its argument, the format specifiers are replaced with the corresponding values, as we can see in the output comment.
Another approach would be to use f-strings, introduced in Python 3.6.
name = "Bob"
age = 30
message = f"My name is {name} and I am {age} years old."
print(message) # Output: My name is Bob and I am 30 years old.
To achieve a similar outcome, we can also use the format() method to combine strings and numbers.
name = "Bob"
age = 30
message = "My name is {} and I am {} years old.".format(name, age)
print(message) # Output: My name is Bob and I am 30 years old.
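All three approaches also accept a format specification that controls how a value is rendered. As a minimal sketch with made-up values, the example below rounds a float to two decimal places in each style:

```python
item = "coffee"   # hypothetical values, for illustration only
price = 3.5
quantity = 2
total = price * quantity

# .2f means: fixed-point notation with two digits after the decimal point.
with_percent = "%d x %s: %.2f" % (quantity, item, total)
with_fstring = f"{quantity} x {item}: {total:.2f}"
with_format = "{} x {}: {:.2f}".format(quantity, item, total)

print(with_percent)  # 2 x coffee: 7.00
print(with_fstring)  # 2 x coffee: 7.00
print(with_format)   # 2 x coffee: 7.00
```

Since all three produce the same result here, the choice mostly comes down to readability and the Python version we need to support.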
Built-in String Methods
Python provides a number of built-in string methods for working with strings. For clarity purposes, we'll provide a table with some of the most commonly used methods:
Method | Description |
---|---|
capitalize() | Returns a copy of a string with the first character uppercased and the rest lowercased |
count(substring) | Returns the number of occurrences of a substring in a string |
endswith(suffix) | Returns True if a string ends with a specified suffix |
find(substring) | Returns the index of the first occurrence of a substring in a string. Returns -1 if the substring is not found |
isalnum() | Returns True if a string contains only alphanumeric characters |
isalpha() | Returns True if a string contains only alphabetic characters |
isdigit() | Returns True if a string contains only digits |
join(iterable) | Joins the elements of an iterable (such as a list) into a single string, with the string as a separator |
lower() | Returns a lowercase version of a string |
lstrip() | Removes leading whitespace from a string |
replace(old, new) | Returns a copy of a string with all occurrences of a substring replaced by another substring |
rstrip() | Removes trailing whitespace from a string |
split(separator) | Splits a string into a list of substrings, using the specified separator |
startswith(prefix) | Returns True if a string starts with a specified prefix |
strip() | Removes leading and trailing whitespace from a string |
upper() | Returns an uppercase version of a string |
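A few of the methods from the table can be chained into a small cleanup routine. The sketch below assumes a made-up, comma-separated input string, and the variable names are illustrative:

```python
raw = "  apple,banana,cherry  "   # hypothetical input, for illustration only

cleaned = raw.strip()            # remove leading and trailing whitespace
fruits = cleaned.split(",")      # split on the comma separator
joined = " | ".join(fruits)      # join the list back with a new separator

print(fruits)                       # ['apple', 'banana', 'cherry']
print(joined)                       # apple | banana | cherry
print(cleaned.startswith("apple"))  # True
print(cleaned.find("banana"))       # 6
print(cleaned.find("kiwi"))         # -1
print(cleaned.count("a"))           # 4
```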
Final Thoughts
By familiarizing yourself with the string intricacies we covered, you'll greatly improve your skill at working with text-based data in Python. Patience is always welcome, as sophisticated programs will likely test your limits.
In the meantime, don't be afraid to experiment with what you've learned here, and be sure to check out the useful resources below for more information.