Python Tokens and Character Sets

Last Updated : 12 Jan, 2026

In Python, every program is formed using valid characters and tokens. The character set defines which characters are allowed in a Python program, while tokens represent the smallest meaningful units such as keywords, identifiers, literals, operators, and punctuators (symbols).

Character Set

A character set is the collection of valid characters that a programming language understands. Python 3 reads source files as UTF-8 by default, so its character set covers far more than ASCII. The Python character set includes:

  1. Alphabets: A–Z, a–z
  2. Digits: 0–9
  3. Special symbols: + - * / % = @ # & _ etc. (the characters $ and ? are valid only inside string literals and comments)
  4. Whitespace characters: space, tab, newline
  5. Unicode characters: string literals and comments can contain any Unicode character, and identifiers can use Unicode letters and digits

These characters are used to form keywords, variables, expressions and statements.
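
As a minimal sketch of the character set in practice, the snippet below compiles a few one-line programs with the built-in compile() function. The helper name compiles and the sample sources are illustrative assumptions, not part of any standard API.

```python
def compiles(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        compile(source, "<demo>", "exec")
        return True
    except SyntaxError:
        return False

# A Unicode letter is accepted as an identifier.
namespace = {}
exec(compile("π = 3.14159", "<demo>", "exec"), namespace)
print(namespace["π"])

# `$` is rejected outside a string literal but accepted inside one.
print(compiles("price$ = 5"))
print(compiles('price = "$5"'))
```

The first check fails because $ is not part of any Python token; the second succeeds because a string literal may contain any character.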

Tokens 

A token is the smallest meaningful unit in a Python program. Before code is parsed and executed, the interpreter's tokenizer (lexical analyzer) breaks the source text into tokens. Python has the following types of tokens:

1. Keywords: Keywords are reserved words with special meaning in Python. They cannot be used as variable or function names. Examples of keywords: if, else, for, while, break, continue, True, False, import, class

Python
for x in range(1, 6):
    if x < 4:
        continue
    break

Here, for, in, if, continue and break are keywords.
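
The standard-library keyword module can be used to check whether a word is reserved. The sketch below is one way to do that; the sample words are chosen only for illustration.

```python
import keyword

# iskeyword() reports whether a string is a reserved word.
print(keyword.iskeyword("for"))     # True
print(keyword.iskeyword("range"))   # False: range is a built-in name, not a keyword

# kwlist holds every keyword of the running Python version.
print("continue" in keyword.kwlist)  # True
```

Because the keyword list changes slightly between Python versions, checking keyword.kwlist is more reliable than memorizing a fixed list.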

2. Identifiers: Identifiers are names given to variables, functions, classes, etc. Rules for identifiers:

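  • Can contain letters, digits, and _ (Unicode letters are also allowed)
  • Cannot start with a digit
  • Cannot be a keyword
  • Case-sensitive (name and Name are different identifiers)

Python
name = "Geeks"
_count = 10

  • Valid identifiers: name, _count
  • Invalid identifiers: 2num (starts with a digit), my-name (contains a hyphen), for (keyword)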
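
The identifier rules can be checked programmatically. The following is a minimal sketch that combines str.isidentifier() with keyword.iskeyword(); the function name is_valid_identifier is a hypothetical helper introduced here for illustration.

```python
import keyword

def is_valid_identifier(name):
    """True if `name` follows the identifier rules and is not a keyword."""
    # isidentifier() alone returns True for keywords, so both checks are needed.
    return name.isidentifier() and not keyword.iskeyword(name)

candidates = ["name", "_count", "2num", "my-name", "for"]
results = {name: is_valid_identifier(name) for name in candidates}

for name, ok in results.items():
    print(f"{name!r}: {ok}")
```

Only name and _count pass both checks, which matches the valid and invalid examples listed above.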

3. Literals (Values): Literals are fixed values written directly in source code. Python supports different types of literals, such as:

3.1 String Literals represent text values enclosed in quotes.

Python
msg = "Hello Python"
print(msg)

Output
Hello Python
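
String literals can be written in several equivalent or specialized forms. The sketch below shows a few common ones; the variable names and sample text are assumptions chosen for illustration.

```python
single = 'Hello Python'          # single quotes
double = "Hello Python"          # double quotes, same value as above
multi = """line one
line two"""                      # triple quotes may span several lines
raw = r"C:\temp\new"             # raw string: backslashes are kept literally

print(single == double)
print(multi.splitlines())
print(len(raw))
```

In the raw string, \t and \n are not treated as tab and newline escapes, so every backslash remains a separate character.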

3.2 Numeric Literals represent integers, floating-point (decimal) numbers, or complex numbers.

Python
a = 10
b = 3.5
print(a)
print(b)

Output
10
3.5
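
Beyond plain decimal integers and floats, numeric literals can be written in other bases and notations. The following sketch collects a few forms in a dictionary; the keys are illustrative labels only.

```python
values = {
    "grouped": 1_000_000,    # underscores improve readability (Python 3.6+)
    "binary": 0b1010,        # base 2
    "octal": 0o17,           # base 8
    "hex": 0xFF,             # base 16
    "exponent": 1.5e3,       # float in scientific notation
    "complex": 2 + 3j,       # real part 2, imaginary part 3
}

for label, value in values.items():
    print(label, value, type(value).__name__)
```

Whatever notation is used in the source, the resulting int, float, or complex object is the same as if it had been written in decimal form.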

3.3 Boolean Literals represent logical values: True or False.

Python
is_valid = True

3.4 Special Literal None represents the absence of a value.

Python
value = None

3.5 Collection Literals represent grouped data such as lists, tuples, dictionaries and sets.

Python
a = [1, 2, 3]
tup = (1, 2)
d = {"a": 1}
s = {1, 2, 3}

print(a)
print(tup)
print(d)
print(s)

Output
[1, 2, 3]
(1, 2)
{'a': 1}
{1, 2, 3}
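
Two collection literals have edge cases that are easy to get wrong: empty braces and single-element tuples. The sketch below illustrates both; the variable names are chosen only for this example.

```python
empty_dict = {}          # empty braces create a dictionary, not a set
empty_set = set()        # an empty set has no literal form
single_tuple = (1,)      # the trailing comma makes this a tuple
not_a_tuple = (1)        # parentheses alone just group the integer 1

print(type(empty_dict).__name__)
print(type(empty_set).__name__)
print(type(single_tuple).__name__)
print(type(not_a_tuple).__name__)
```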

4. Operators: Operators are tokens that perform an operation in an expression. The values or variables an operator acts on are called operands.

Python
a = 5
b = 2
print(a + b)   # binary operator: adds two operands
print(~a)      # unary operator: bitwise NOT, equal to -(a + 1)

Output
7
-6
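
Operators fall into several families, such as arithmetic, comparison, logical, and bitwise. As a rough sketch using the same operands as above, the snippet below evaluates one or two operators from each family; the dictionary keys are only labels for printing.

```python
a = 5
b = 2

results = {
    "a / b": a / b,                       # true division
    "a // b": a // b,                     # floor division
    "a % b": a % b,                       # remainder
    "a ** b": a ** b,                     # exponentiation
    "a > b": a > b,                       # comparison
    "a > 0 and b > 0": a > 0 and b > 0,   # logical AND
    "a & b": a & b,                       # bitwise AND: 0b101 & 0b010
    "a << b": a << b,                     # left shift by b bits
}

for expression, value in results.items():
    print(f"{expression} = {value}")
```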

5. Punctuators: These are the symbols used in Python to organize structures, statements, and expressions. The Python language reference calls them delimiters. Some of the punctuators are: [ ] { } ( ) @  -=  +=  *=  //=  **=  = , etc.
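
To see how a statement is split into the token types described above, the standard-library tokenize module can be run on a line of source. This is a minimal sketch; the sample statement and its variable names are assumptions. Note that tokenize reports keywords and identifiers alike as NAME, and operators and punctuators alike as OP, with exact_type giving the specific symbol.

```python
import io
import tokenize

source = "if count >= 10: total += values[0]\n"

rows = [
    (tokenize.tok_name[tok.type], tokenize.tok_name[tok.exact_type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]

# Columns: general token type, exact type, token text.
for kind, exact, text in rows:
    print(f"{kind:10} {exact:14} {text!r}")
```

The tokenizer only splits text into tokens; it does not check that names such as count or values are defined, so the statement does not need to be executable.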
