In Python, every program is formed using valid characters and tokens. The character set defines which characters are allowed in a Python program, while tokens represent the smallest meaningful units such as keywords, identifiers, literals, operators, and punctuators.
Character Set
A character set is the collection of valid characters that a programming language understands. Python accepts a wide range of characters in source code. The Python character set includes:
- Alphabets: A–Z, a–z
- Digits: 0–9
- Special symbols: + - * / % = @ # & _ etc. (the characters $ and ? are valid only inside string literals and comments)
- Whitespace characters: space, tab, newline
- Unicode characters: Python 3 source files are UTF-8 by default, so Unicode characters can appear in strings, comments and identifiers
These characters are used to form keywords, variables, expressions and statements.
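As a small illustration of the groups listed above, the sketch below mixes ASCII letters, digits, symbols and non-ASCII text. The names café, greeting and code_points are made up for this example.

```python
# Identifiers may contain Unicode letters, not only A-Z and a-z.
café = "open"                    # hypothetical variable name with a non-ASCII letter
greeting = "नमस्ते, 世界"          # string literals can hold any Unicode text

# ord() returns the Unicode code point of a single character.
code_points = [ord(ch) for ch in "A9_"]

print(café, greeting)
print(code_points)
```

The three code points printed belong to a letter, a digit and the underscore, one from each of the first three groups in the list.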
Tokens
A token is the smallest meaningful unit in a Python program. Before running code, the interpreter's lexical analyzer breaks the source text into tokens. Python has the following types of tokens:
1. Keywords: Keywords are reserved words with special meaning in Python. They cannot be used as variable or function names. Examples of keywords: if, else, for, while, break, continue, True, False, import, class
for x in range(1, 6):
    if x < 4:
        continue
    break
Here, for, in, if, continue and break are keywords.
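The full set of reserved words depends on the Python version, so it is safer to ask the interpreter than to memorize a list. A minimal sketch using the standard-library keyword module (the dictionary name checks is chosen for this example):

```python
import keyword

# keyword.kwlist holds the reserved words of the running interpreter.
print(len(keyword.kwlist))

# keyword.iskeyword() reports whether a string is a reserved word.
words = ["for", "continue", "True", "x", "range"]
checks = {word: keyword.iskeyword(word) for word in words}
print(checks)
```

Note that range is a built-in function, not a keyword, so it is reported as False.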
2. Identifiers: Identifiers are names given to variables, functions, classes, etc. Rules for identifiers:
- Can contain letters, digits, and _
- Cannot start with a digit
- Cannot be a keyword
- Case-sensitive
name = "Geeks"
_count = 10
- Valid identifiers: name, _count
- Invalid identifiers: 2num, my-name, for
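The rules above can be checked programmatically. The sketch below combines str.isidentifier(), which tests the character rules, with keyword.iskeyword(), which rules out reserved words. The helper name is_valid_identifier is hypothetical and not part of the standard library.

```python
import keyword

def is_valid_identifier(name):
    """Return True if name could be used as a variable name."""
    # isidentifier() checks the character rules;
    # iskeyword() rejects reserved words such as 'for'.
    return name.isidentifier() and not keyword.iskeyword(name)

for candidate in ["name", "_count", "2num", "my-name", "for"]:
    print(candidate, is_valid_identifier(candidate))
```

The first two candidates pass, while 2num (starts with a digit), my-name (contains a hyphen) and for (a keyword) are rejected.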
3. Literals (Values): Literals are fixed values written directly in source code. Python supports different types of literals such as:
3.1 String Literals represent text values enclosed in quotes.
msg = "Hello Python"
print(msg)
Output
Hello Python
3.2 Numeric Literals represent integers, floating-point (decimal) numbers, or complex numbers.
a = 10
b = 3.5
print(a)
print(b)
Output
10
3.5
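Numeric literals can also be written in several other notations. The sketch below shows a few of them; all variable names are chosen for illustration only.

```python
count = 1_000_000      # underscores may group digits for readability
mask = 0xFF            # hexadecimal integer
bits = 0b1010          # binary integer
ratio = 2.5e-3         # float in scientific notation
z = 3 + 4j             # complex number; 4j is an imaginary literal

print(count, mask, bits, ratio, z)
print(type(count).__name__, type(ratio).__name__, type(z).__name__)
```

Whatever notation is used in the source, the resulting objects are ordinary int, float and complex values.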
3.3 Boolean Literals represent logical values: True or False.
is_valid = True
3.4 Special Literal None represents the absence of a value.
value = None
3.5 Collection Literals represent grouped data such as lists, tuples, dictionaries and sets.
a = [1, 2, 3]
tup = (1, 2)
d = {"a": 1}
s = {1, 2, 3}
print(a)
print(tup)
print(d)
print(s)
Output
[1, 2, 3]
(1, 2)
{'a': 1}
{1, 2, 3}
4. Operators: These are the tokens that perform an operation in an expression. The values or variables on which the operation is applied are called operands.
a = 5
b = 2
print(a + b)
print(~a)
Output
7
-6
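Operators fall into several families, such as arithmetic, comparison, logical and bitwise. A brief sketch reusing the same operands a and b:

```python
a = 5
b = 2

print(a // b)            # floor division: 2
print(a % b)             # remainder: 1
print(a ** b)            # exponentiation: 25
print(a > b and b > 0)   # comparison combined with logical 'and': True
print(a & b)             # bitwise AND of 0b101 and 0b010: 0
```

Here and is both a keyword and an operator, which shows that the token categories can overlap.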
5. Punctuators: These are the symbols used in Python to organize structures, statements, and expressions. Some of the punctuators are: [ ] { } ( ) @ -= += *= //= **= = , etc.
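To see how a line of code is split into tokens, the standard-library tokenize module can be used. This is a minimal sketch; the sample statement in source is made up for the example. Note that tokenize uses coarser labels than the five categories above: keywords and identifiers are both reported as NAME, and operators and punctuators are both reported as OP.

```python
import io
import tokenize

source = "total = price + 10\n"

# generate_tokens() takes a readline callable and yields TokenInfo tuples.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

for tok in tokens:
    # tok.type is a numeric code; tok_name maps it to a readable label.
    print(tokenize.tok_name[tok.type], repr(tok.string))
```

The first five tokens are the two names, the = and + symbols, and the number 10, followed by bookkeeping tokens that mark the end of the line and of the input.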