🐼 Pandas Concepts with Demo Programs
This blog covers essential Pandas concepts with simple code examples. Each demo has syntax highlighting and a copy button, which makes it easy to learn from or share.
1. Importing Pandas & NumPy
import pandas as pd
import numpy as np
2. Creating Series & DataFrames
# Series
s = pd.Series([1, 3, 5, np.nan, 6, 8])
print(s)
# DataFrame from NumPy array
dates = pd.date_range("20230101", periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list("ABCD"))
print(df)
# DataFrame from dictionary
df2 = pd.DataFrame({
    "A": 1.0,
    "B": pd.Timestamp("20230102"),
    "C": pd.Series(1, index=list(range(4)), dtype="float32"),
    "D": np.array([3]*4, dtype="int32"),
    "E": pd.Categorical(["test", "train", "test", "train"]),
    "F": "foo",
})
print(df2)
print(df2.dtypes)
3. Viewing Data
print(df.head()) # Top 5 rows
print(df.tail(3)) # Last 3 rows
4. Selecting Columns
print(df["A"]) # Single column
print(df[["A", "B"]]) # Multiple columns
5. Selecting Rows
print(df.loc[dates[0]]) # Row by label
print(df.iloc[0:3, 0:2]) # Row and column by integer position
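One detail worth seeing side by side: label slices with loc include both endpoints, while integer slices with iloc exclude the stop position. The sketch below assumes a small hypothetical frame named demo (not part of the demos above) so the row counts are easy to check.

```python
import numpy as np
import pandas as pd

# Hypothetical demo frame with a date index, for illustration only
dates = pd.date_range("20230101", periods=6)
demo = pd.DataFrame(np.arange(12).reshape(6, 2), index=dates, columns=["A", "B"])

# Label-based slice: both endpoints are included (3 rows)
by_label = demo.loc["20230102":"20230104", ["A"]]
print(by_label)

# Position-based slice: the stop position is excluded (2 rows)
by_pos = demo.iloc[1:3, [0]]
print(by_pos)

# Fast access to a single value by row label and column label
cell = demo.at[dates[0], "A"]
print(cell)
```

For single values, at (labels) and iat (positions) are the scalar counterparts of loc and iloc.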
6. Boolean Indexing
print(df[df["A"] > 0]) # Rows where column A > 0
print(df[df > 0]) # Keep positive values; all other cells become NaN
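Boolean masks can also be combined or built from a list of allowed values. This is a minimal sketch on a hypothetical frame named demo; the column names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical demo frame, for illustration only
demo = pd.DataFrame({
    "A": [-1, 2, 3],
    "B": [5, -6, 7],
    "E": ["x", "y", "z"],
})

# Combine conditions with & (and) or | (or); each condition needs parentheses
both_pos = demo[(demo["A"] > 0) & (demo["B"] > 0)]
print(both_pos)

# Keep rows whose value in E is one of the listed values
picked = demo[demo["E"].isin(["x", "z"])]
print(picked)
```

Python's plain and/or keywords do not work element-wise on a Series, which is why the bitwise operators are used here.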
7. Setting Values
df.iloc[0, 1] = 100 # Set a single cell by integer position
df.loc[:, "B"] = np.array([1]*6) # Overwrite column B with a NumPy array
print(df)
8. Handling Missing Data
df2 = df.copy() # Work on a copy so df stays unchanged
df2.iloc[0, 0] = np.nan # Introduce a missing value
print(df2.dropna()) # Drop rows with NaN
print(df2.fillna(0)) # Fill NaN with 0
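Before dropping or filling, it often helps to see where the missing values are. The sketch below assumes a small hypothetical frame named demo, and filling with the column mean is just one possible strategy, not a recommendation from this post.

```python
import numpy as np
import pandas as pd

# Hypothetical demo frame with a few missing values, for illustration only
demo = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "B": [np.nan, np.nan, 6.0],
})

# Boolean mask: True where a value is missing
mask = demo.isna()
print(mask)

# Count missing values per column
per_col = mask.sum()
print(per_col)

# Fill each column's gaps with that column's mean (NaN is skipped by mean)
filled = demo.fillna(demo.mean())
print(filled)
```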
9. Operations & Descriptive Statistics
print(df.mean()) # Column-wise mean
print(df.cumsum()) # Cumulative sum
print(df.describe()) # Summary statistics
10. Apply & Map
df2 = df.apply(np.cumsum) # Apply a function to each column
print(df2)
df["A"] = df["A"].map(lambda x: x*2) # Double every value in column A
print(df)
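apply also accepts a custom function that reduces each column to a single value. A small sketch, assuming a hypothetical frame named demo with made-up numbers:

```python
import pandas as pd

# Hypothetical demo frame, for illustration only
demo = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 40]})

# Reduce each column to its range (max minus min); the result is a Series
spread = demo.apply(lambda col: col.max() - col.min())
print(spread)

# map works element by element on a single Series
doubled = demo["A"].map(lambda x: x * 2)
print(doubled)
```

In short, apply on a DataFrame passes a whole column (or row, with axis=1) to the function, while map on a Series passes one value at a time.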
11. Sorting
df.sort_index(axis=1, ascending=False, inplace=True) # Sort columns
df.sort_values(by="B", ascending=True, inplace=True) # Sort rows by column B
print(df)
12. Merging & Concatenating
df1 = pd.DataFrame(np.random.randn(3, 4), columns=list("ABCD"))
df2 = pd.DataFrame(np.random.randn(3, 4), columns=list("ABCD"))
result = pd.concat([df1, df2]) # Stack rows; each frame keeps its original index labels
print(result)
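The demo above covers concatenation only. For the merging half of this section, here is a minimal sketch of a key-based join with pd.merge, assuming two hypothetical frames named left and right that share a key column:

```python
import pandas as pd

# Hypothetical frames sharing a "key" column, for illustration only
left = pd.DataFrame({"key": ["foo", "bar"], "lval": [1, 2]})
right = pd.DataFrame({"key": ["foo", "bar"], "rval": [4, 5]})

# Join rows whose "key" values match (inner join is the default)
merged = pd.merge(left, right, on="key")
print(merged)
```

concat stacks frames along an axis, while merge lines rows up by matching key values, similar to a SQL join.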
13. Grouping
df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar"],
    "B": ["one", "one", "two", "two"],
    "C": np.random.randn(4),
    "D": np.random.randn(4)
})
grouped = df.groupby("A")[["C", "D"]].sum() # Sum the numeric columns per group
print(grouped)
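Grouping also works on several columns at once, and agg can compute more than one statistic per group. The sketch below uses a hypothetical frame named demo with fixed numbers (instead of random ones) so the results are easy to verify by hand.

```python
import pandas as pd

# Hypothetical demo frame with fixed values, for illustration only
demo = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar"],
    "B": ["one", "one", "two", "two"],
    "C": [1.0, 2.0, 3.0, 4.0],
})

# Group by two columns; the result has a two-level index (A, B)
by_ab = demo.groupby(["A", "B"])["C"].sum()
print(by_ab)

# Several aggregations at once for each group in A
stats = demo.groupby("A")["C"].agg(["sum", "mean"])
print(stats)
```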