Python Strings

Strings are sequences of characters used to represent text in Python. They are immutable objects with many built-in methods for manipulation. This tutorial covers string creation, operations, and formatting.

Creating Strings

Strings can be created with single, double, or triple quotes.

String Literals

# Single quotes
single = 'Hello, World!'

# Double quotes
double = "Python strings"

# Triple quotes for multiline
multiline = """This is a
multiline string
that spans multiple lines"""

# Empty string
empty = ""

print(multiline)

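Choose the quote style that avoids escaping: double quotes when the text contains an apostrophe, single quotes when it contains double quotes. Triple-quoted strings can span several lines, and each line break becomes part of the string.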
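As a small illustration of picking quotes by content, the sketch below uses made-up sample values; none of them come from the examples above.

```python
# Hypothetical sample values chosen for illustration.
apostrophe = "It's a string"       # double quotes avoid escaping the apostrophe
dialogue = 'She said "Hello"'      # single quotes avoid escaping double quotes
escaped = 'It\'s a string'         # same text as `apostrophe`, but harder to read

# Triple-quoted strings keep their line breaks as "\n" characters.
two_lines = """first line
second line"""

print(apostrophe == escaped)       # True
print(two_lines.count("\n"))       # 1
```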

Escape Sequences

Use backslashes to include special characters.

# Escape sequences
newline = "Line 1\nLine 2"
tabbed = "Name:\tAlice"
quote = "She said \"Hello\""
backslash = "Path: C:\\folder\\file"

# Raw strings (no escape processing)
raw_path = r"C:\folder\file"

print(newline)
print(raw_path)

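Raw strings, written with an ‘r’ prefix, keep backslashes as literal characters, which is convenient for Windows paths and regular expressions. A raw string cannot end with a single backslash. See the escape sequences documentation.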
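The following sketch compares an escaped string with its raw equivalent and shows a raw string used as a regular-expression pattern. The sample text passed to re.search is an assumption made for illustration.

```python
import re

normal = "C:\\folder\\file"    # each \\ becomes one backslash
raw = r"C:\folder\file"        # backslashes are kept as written

print(normal == raw)           # True
print(len(r"\n"), len("\n"))   # 2 1

# Raw strings keep regular-expression escapes such as \d readable.
match = re.search(r"\d+", "order 42 shipped")
print(match.group())           # 42
```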

String Operations

Basic operations for working with strings.

Concatenation and Repetition

# Concatenation
first = "Hello"
second = "World"
combined = first + " " + second  # "Hello World"

# Repetition
repeated = "Ha" * 3  # "HaHaHa"

# Membership
has_hello = "Hello" in combined  # True

print(combined)
print(repeated)

Use + for concatenation and * for repetition. The ‘in’ operator tests whether one string occurs inside another.

Indexing and Slicing

Access individual characters or substrings.

text = "Python"

# Indexing
first_char = text[0]     # 'P'
last_char = text[-1]     # 'n'
third_char = text[2]     # 't'

# Slicing
first_three = text[:3]   # 'Pyt'
middle = text[1:4]       # 'yth'
every_other = text[::2]  # 'Pto'
reversed_str = text[::-1]  # 'nohtyP'

print(f"First: {first_char}, Last: {last_char}")
print(f"Reversed: {reversed_str}")

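Slicing uses [start:end:step] syntax. The character at the end index is not included, and any omitted value defaults to the start, the end, or a step of 1. Learn more about string slicing.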
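A few edge cases are easy to get wrong, so here is a short sketch of how slice bounds behave compared with single-index access.

```python
text = "Python"

# The end index is exclusive, so text[1:4] covers indexes 1, 2, and 3.
print(text[1:4])      # yth

# Negative indexes count from the end.
print(text[-3:])      # hon

# Slices clamp out-of-range bounds instead of raising.
print(text[2:100])    # thon
print(text[10:])      # prints an empty line

# Indexing a single out-of-range position does raise.
try:
    text[10]
except IndexError as exc:
    print(f"IndexError: {exc}")
```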

String Methods

Strings have many built-in methods for common operations.

Case Conversion

text = "Hello World"

# Change case
upper = text.upper()      # 'HELLO WORLD'
lower = text.lower()      # 'hello world'
title = text.title()      # 'Hello World'
capitalize = text.capitalize()  # 'Hello world'

print(upper)
print(title)

These methods return new strings (strings are immutable).
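Because the result is a new string, it has to be assigned to be kept. The sketch below shows that, along with two behaviors worth knowing: title() capitalizes after apostrophes, and casefold() is a more thorough option than lower() for case-insensitive comparison. The sample strings are illustrative.

```python
name = "hello world"
name.upper()                     # result discarded; name is unchanged
print(name)                      # hello world
name = name.upper()
print(name)                      # HELLO WORLD

# title() capitalizes after any non-letter, including apostrophes.
print("they're here".title())    # They'Re Here

# casefold() handles cases that lower() leaves alone, such as the German ß.
print("Straße".lower() == "strasse")      # False
print("Straße".casefold() == "strasse")   # True
```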

Searching and Finding

text = "The quick brown fox jumps over the lazy dog"

# Find substrings
fox_index = text.find("fox")      # 16
missing = text.find("cat")        # -1

# Count occurrences
o_count = text.count("o")         # 4

# Check start/end
starts_with = text.startswith("The")  # True
ends_with = text.endswith("dog")      # True

print(f"Fox at index: {fox_index}")

find() returns -1 when the substring is not found, whereas index() raises ValueError. Both are case-sensitive. See string methods for all options.
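To make the difference between the search options concrete, here is a brief sketch using the same sentence. It assumes nothing beyond the built-in str methods.

```python
text = "The quick brown fox jumps over the lazy dog"

# `in` is enough when only a yes/no answer is needed.
print("fox" in text)             # True

# find() signals a miss with -1, which is also a valid negative index,
# so check the result before using it to slice or index.
position = text.find("cat")
if position == -1:
    print("no cat here")

# index() raises ValueError instead of returning -1.
try:
    text.index("cat")
except ValueError:
    print("index() raised ValueError")

# find() is case-sensitive; rfind() searches from the right.
print(text.find("the"), text.rfind("o"))   # 31 41
```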

Modifying Strings

text = "  Hello World  "

# Remove whitespace
stripped = text.strip()        # 'Hello World'
left_strip = text.lstrip()     # 'Hello World  '
right_strip = text.rstrip()    # '  Hello World'

# Replace
replaced = text.replace("World", "Python")  # '  Hello Python  '

# Split and join
words = text.split()           # ['Hello', 'World']
joined = " ".join(words)       # 'Hello World'

print(f"Stripped: '{stripped}'")
print(f"Words: {words}")

strip() removes leading and trailing whitespace, and split() with no argument divides a string on runs of whitespace. Check the str methods documentation.
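Both methods accept optional arguments that change their behavior. The sketch below uses a hypothetical comma-separated record to show a few of them.

```python
record = "alice,30,london"           # hypothetical comma-separated record

# Passing a separator splits on that exact string.
fields = record.split(",")           # ['alice', '30', 'london']

# maxsplit limits the number of splits, counting from the left.
head, rest = record.split(",", 1)    # 'alice', '30,london'

# strip() also accepts a set of characters to remove from both ends.
trimmed = "--title--".strip("-")     # 'title'

# split() with no argument collapses runs of whitespace;
# split(" ") does not, so empty strings appear in the result.
print("a  b".split())                # ['a', 'b']
print("a  b".split(" "))             # ['a', '', 'b']

# join() requires every item to be a string.
print("-".join(str(n) for n in [1, 2, 3]))   # 1-2-3
```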

Checking Content

# Check character types
is_alpha = "Hello".isalpha()        # True
is_digit = "123".isdigit()          # True
is_alnum = "Hello123".isalnum()     # True
is_space = "   ".isspace()          # True

# Check case
is_upper = "HELLO".isupper()        # True
is_lower = "hello".islower()        # True
is_title = "Hello World".istitle()  # True

print(is_alpha, is_digit, is_alnum)

These methods help validate string content. The character-type checks return True only if every character matches and the string is not empty.
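These checks can surprise on edge cases such as empty strings, signs, and decimal points. The sketch below shows a few, followed by a hypothetical helper, is_integer_text, that relies on int() rather than isdigit() when signed numbers must be accepted.

```python
# The character-type checks return False for the empty string.
print("".isalpha(), "".isdigit())      # False False

# All characters must match, so spaces and punctuation cause a False.
print("Hello World".isalpha())         # False
print("3.14".isdigit())                # False
print("-5".isdigit())                  # False

def is_integer_text(value):
    """Return True if value parses as a base-10 integer (hypothetical helper)."""
    try:
        int(value)
    except ValueError:
        return False
    return True

print(is_integer_text("-5"), is_integer_text("3.14"))   # True False
```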

String Formatting

Multiple ways to insert values into strings.

f-Strings (Python 3.6+)

name = "Alice"
age = 30
height = 5.6

# f-string formatting
message = f"Hello, {name}! You are {age} years old."
details = f"Name: {name.upper()}, Age: {age + 1} next year"

# Format numbers
price = 19.99
formatted = f"Price: ${price:.2f}"

print(message)
print(formatted)

f-strings are the preferred way to format strings in modern Python. A format specification after the colon, such as :.2f, controls how the value is rendered.
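Format specifications cover more than decimal places. The sketch below shows some common ones; the variable values are made up for illustration.

```python
name = "Alice"
balance = 1234567.891
ratio = 0.256

print(f"{balance:,.2f}")    # 1,234,567.89  (thousands separator, 2 decimals)
print(f"{ratio:.1%}")       # 25.6%         (percentage)
print(f"[{name:<10}]")      # [Alice     ]  (left-aligned, width 10)
print(f"[{name:>10}]")      # [     Alice]  (right-aligned)
print(f"[{name:^11}]")      # [   Alice   ] (centered, width 11)
print(f"{42:05d}")          # 00042         (zero-padded)
print(f"{255:x}")           # ff            (hexadecimal)
```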

format() Method

# Positional arguments
template = "Hello, {}! You are {} years old."
message = template.format("Alice", 30)

# Named arguments
template = "Hello, {name}! Age: {age}"
message = template.format(name="Bob", age=25)

# Format numbers
price = 19.99
formatted = "Price: ${:.2f}".format(price)

print(message)
print(formatted)

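format() is available in Python 2.6 and later, including every Python 3 release, and is useful when a template is defined separately from its values. See string formatting guide.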
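One situation where format() remains handy is a template that is defined once and filled in later. The sketch below assumes a small list of dictionaries as hypothetical input data.

```python
# A template can be stored and filled in later. An f-string cannot be reused
# this way because it is evaluated where it is written.
template = "{greeting}, {name}!"

people = [
    {"greeting": "Hello", "name": "Alice"},
    {"greeting": "Hi", "name": "Bob"},
]
messages = [template.format(**person) for person in people]
print(messages)   # ['Hello, Alice!', 'Hi, Bob!']

# Numbered fields can repeat or reorder arguments.
print("{0} and {1}, then {0} again".format("a", "b"))   # a and b, then a again
```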

Old-style Formatting

# % formatting (older style)
name = "Alice"
age = 30
message = "Hello, %s! Age: %d" % (name, age)

# Format numbers
price = 19.99
formatted = "Price: $%.2f" % price

print(message)

% formatting is legacy but still works. f-strings are preferred.

String Immutability

Strings cannot be changed after creation.

text = "Hello"
# text[0] = "h"  # TypeError!

# Instead, create new string
new_text = "h" + text[1:]  # 'hello'

# Methods return new strings
upper_text = text.upper()  # 'HELLO'
print(text)                # Still 'Hello'

Because a string's value never changes, its hash stays stable, which makes strings safe to use as dictionary keys. Learn about immutable objects.
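The dictionary-key point can be sketched as follows. The stock dictionary is hypothetical data used only for illustration.

```python
# Hypothetical inventory data for illustration.
stock = {"apple": 3, "pear": 0}
stock["apple"] += 2
print(stock["apple"])        # 5

# Equal strings hash equally, so a key built at run time finds the same entry.
key = "".join(["ap", "ple"])
print(stock[key])            # 5

# A mutable list cannot be used as a key.
try:
    stock[["apple"]] = 1
except TypeError as exc:
    print(f"TypeError: {exc}")
```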

String Encoding

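In Python 3, every str is a sequence of Unicode code points. encode() converts a str to bytes, and decode() converts bytes back to a str.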

# Unicode strings
greek = "Γειά σου κόσμε"  # Greek: "Hello world"
emoji = "🚀🌟"           # Rocket and star

# Encode to bytes
utf8_bytes = greek.encode('utf-8')
print(f"UTF-8 bytes: {utf8_bytes}")

# Decode back
decoded = utf8_bytes.decode('utf-8')
print(f"Decoded: {decoded}")

Unicode handles international text. See Unicode documentation for details.
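The distinction between characters and bytes shows up as soon as text leaves ASCII. The sketch below uses escape sequences for the sample characters so the values are unambiguous.

```python
text = "caf\u00e9"                  # 'café', written with an escape for clarity
data = text.encode("utf-8")

# len() counts code points for str and bytes for bytes.
print(len(text), len(data))         # 4 5

rocket = "\U0001F680"               # the rocket emoji
print(len(rocket), len(rocket.encode("utf-8")))   # 1 4

# Encoding to a codec that lacks the character raises UnicodeEncodeError...
try:
    text.encode("ascii")
except UnicodeEncodeError:
    print("not representable in ASCII")

# ...unless an error handler is supplied.
print(text.encode("ascii", errors="replace"))     # b'caf?'

# Decoding with the wrong codec can silently produce wrong text.
print(data.decode("latin-1"))       # cafÃ©
```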

Common String Patterns

Cleaning User Input

def clean_input(user_input):
    """Clean and validate user input."""
    if not user_input:
        return ""
    
    # Strip whitespace and convert to title case
    cleaned = user_input.strip().title()
    return cleaned

user_name = "  john doe  "
clean_name = clean_input(user_name)
print(f"Clean: '{clean_name}'")

Cleaning input before storing or comparing it keeps stray whitespace and inconsistent capitalization from producing mismatched values.

Text Processing

text = "The quick brown fox jumps over the lazy dog."

# Split into words
words = text.split()

# Count words
word_count = len(words)

# Find unique words
unique_words = set(words)

# Create acronym
acronym = "".join(word[0].upper() for word in words)

print(f"Words: {word_count}")
print(f"Unique: {len(unique_words)}")
print(f"Acronym: {acronym}")

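Text processing is common in data analysis. Note that set(words) is case-sensitive and keeps punctuation, so "The", "the", and "dog." are all counted as written.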
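When words should be compared regardless of case and punctuation, they can be normalized first. This is a minimal sketch using collections.Counter and string.punctuation from the standard library; real text may need more careful tokenization.

```python
from collections import Counter
import string

text = "The quick brown fox jumps over the lazy dog."

# Lowercase each word and strip surrounding punctuation before comparing.
normalized = [word.strip(string.punctuation).lower() for word in text.split()]

counts = Counter(normalized)
print(len(set(normalized)))      # 8
print(counts.most_common(1))     # [('the', 2)]
```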

String Validation

def is_valid_email(email):
    """Simple structural check, not full email validation."""
    # Require exactly one "@" separating the two parts
    parts = email.split("@")
    if len(parts) != 2:
        return False

    local, domain = parts
    # The local part must be non-empty and the domain must contain a dot
    return len(local) > 0 and "." in domain

emails = ["user@example.com", "invalid", "user@", "@domain.com"]
for email in emails:
    print(f"{email}: {is_valid_email(email)}")

A check like this only rejects obviously malformed addresses; it does not confirm that an address exists.

Performance Tips

  • Use join() instead of + for concatenation in loops
  • Prefer f-strings for formatting (usually the fastest option)
  • Use str.startswith() and str.endswith() for prefix/suffix checks

# Efficient concatenation
words = ["Hello", "World", "Python"]
sentence = " ".join(words)  # Fast

# Inefficient (don't do this)
sentence = ""
for word in words:
    sentence += word + " "  # Creates a new string each time and leaves a trailing space

join() avoids creating an intermediate string on every iteration, so it is more efficient when many pieces are combined.
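One way to check the difference on a given machine is to time both approaches with timeit. The sketch below is an assumption-laden micro-benchmark: the workload size and repeat count are arbitrary, the function names are hypothetical, and CPython can sometimes extend a string in place, so the gap may be smaller than expected.

```python
import timeit

words = ["word"] * 1000   # hypothetical workload

def with_join():
    return " ".join(words)

def with_plus():
    sentence = ""
    for i, word in enumerate(words):
        if i:
            sentence += " "   # add the separator only between words
        sentence += word
    return sentence

# Both functions build the same string.
assert with_join() == with_plus()

# Timings vary by machine and Python version, so no expected values are shown.
print("join:", timeit.timeit(with_join, number=200))
print("+=:  ", timeit.timeit(with_plus, number=200))
```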

Best Practices

  1. Use f-strings for new code (Python 3.6+)
  2. Handle encoding/decoding explicitly for file I/O
  3. Use raw strings for regular expressions and Windows paths
  4. Prefer string methods over manual operations
  5. Validate and clean input strings
