Every data professional knows the feeling. You open a new project, and the first thing you need to do is clean, transform, or merge some data. And every time, you end up writing the same boilerplate code.
Load CSV. Drop duplicates. Handle missing values. Merge sheets. Save to Excel.
Sound familiar?
I got tired of rewriting these patterns across projects, so I built a set of reusable Python templates that handle the 10 most common data processing tasks.
DataForge Pro is a collection of 10 production-ready Python templates for data processing. Each template is a standalone script that you can copy, customize, and integrate into your workflow.
The foundation template. Load a CSV, Excel, or JSON file, preview its structure, and save it in a different format.
from core import DataForge
df = (DataForge()
    .load('data.csv')
    .preview()
    .save('output.xlsx'))
Remove duplicates, handle missing values, trim whitespace, and standardize column names — all in a chainable API.
df = (DataForge()
    .load('messy_data.csv')
    .remove_duplicates()
    .drop_empty_rows()
    .trim_whitespace()
    .standardize_columns()
    .save('clean_data.xlsx'))
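To see what a chain like this could map to, here is a minimal sketch in plain pandas. It is not DataForge's implementation: the helper name `clean_frame` and the lowercase-with-underscores column convention are assumptions made for illustration.

```python
import pandas as pd


def clean_frame(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: approximate the four cleaning steps with pandas."""
    out = df.copy()
    # Standardize column names: strip, lowercase, spaces -> underscores (assumed convention).
    out.columns = [str(c).strip().lower().replace(" ", "_") for c in out.columns]
    # Trim whitespace in string cells only; non-string values pass through unchanged.
    out = out.apply(lambda col: col.map(lambda v: v.strip() if isinstance(v, str) else v))
    # Drop rows where every value is missing, then exact duplicate rows.
    out = out.dropna(how="all").drop_duplicates()
    return out.reset_index(drop=True)


raw = pd.DataFrame({
    " First Name ": [" Ann", "Ann ", None, "Bob"],
    "City": ["Oslo", "Oslo", None, " Rome "],
})
cleaned = clean_frame(raw)
```

The sketch trims whitespace before de-duplicating, so `" Ann"` and `"Ann "` collapse into a single row. Running the steps in the other order would keep both.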
Convert between CSV, Excel (.xlsx/.xls), and JSON with a single line.
# CSV to Excel
DataForge().load('data.csv').save('data.xlsx')
# Excel to JSON
DataForge().load('data.xlsx').save('data.json')
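One common way to get this kind of one-line conversion is to pick the reader and writer from the file extension. The sketch below does that with pandas. `load_any` and `save_any` are hypothetical names, and this is not necessarily how DataForge does it. Reading or writing Excel files this way needs openpyxl (or xlrd for legacy .xls input).

```python
from pathlib import Path

import pandas as pd

# Map file extensions to pandas readers (Excel needs openpyxl or xlrd installed).
READERS = {
    ".csv": pd.read_csv,
    ".xlsx": pd.read_excel,
    ".xls": pd.read_excel,
    ".json": pd.read_json,
}


def load_any(path: str) -> pd.DataFrame:
    """Hypothetical helper: pick a reader from the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"Unsupported input format: {suffix}")
    return READERS[suffix](path)


def save_any(df: pd.DataFrame, path: str) -> None:
    """Hypothetical helper: pick a writer from the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        df.to_csv(path, index=False)
    elif suffix == ".xlsx":
        df.to_excel(path, index=False)
    elif suffix == ".json":
        df.to_json(path, orient="records")
    else:
        raise ValueError(f"Unsupported output format: {suffix}")
```

With these two helpers, `save_any(load_any('data.csv'), 'data.json')` is the whole conversion. The sketch only writes .xlsx on the Excel side and rejects unknown extensions rather than guessing.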
The Excel VLOOKUP equivalent in Python. Match and merge data from two files using a common key column.
df = (DataForge()
    .load('orders.csv')
    .vlookup('customers.xlsx', 'CustomerID', ['Name', 'Email', 'City'])
    .save('enriched_orders.xlsx'))
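In plain pandas, a VLOOKUP-style enrichment is usually a left join on the key column. The sketch below uses small made-up frames in place of the two files. How `vlookup` handles duplicate keys or unmatched rows is not stated here, so the choices below are assumptions.

```python
import pandas as pd

# Illustrative data standing in for orders.csv and customers.xlsx.
orders = pd.DataFrame({"OrderID": [1, 2, 3], "CustomerID": [10, 20, 99]})
customers = pd.DataFrame({
    "CustomerID": [10, 20],
    "Name": ["Ann", "Bob"],
    "City": ["Oslo", "Rome"],
})

# Keep only the key plus the columns to pull in, one row per key,
# so the join cannot multiply order rows.
lookup = customers[["CustomerID", "Name", "City"]].drop_duplicates("CustomerID")

# how="left" keeps every order; validate raises if the lookup key is not unique.
enriched = orders.merge(lookup, on="CustomerID", how="left", validate="many_to_one")
```

Orders with no matching customer (CustomerID 99 here) stay in the result with missing values in the looked-up columns, much like an Excel VLOOKUP returning #N/A.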
Create Excel-style pivot tables with group-by and aggregation functions.
df = (DataForge()
    .load('sales.csv')
    .pivot(group_by=['Region', 'Product'],
        agg={'Revenue': 'sum', 'Quantity': 'count'})
    .save('pivot_report.xlsx'))
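For comparison, the same group-by and aggregation can be written directly with pandas `groupby` and `agg`. This is a sketch with made-up data, not the template's code.

```python
import pandas as pd

# Illustrative data standing in for sales.csv.
sales = pd.DataFrame({
    "Region": ["North", "North", "South"],
    "Product": ["A", "A", "B"],
    "Revenue": [100, 50, 70],
    "Quantity": [1, 2, 3],
})

# One output row per (Region, Product) pair; as_index=False keeps the keys as columns.
report = (
    sales.groupby(["Region", "Product"], as_index=False)
    .agg({"Revenue": "sum", "Quantity": "count"})
)
```

Note that in pandas `"count"` counts the non-missing rows in each group. If the goal is total units sold, `"sum"` on `Quantity` is the aggregation to use instead.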
Find differences between two datasets — added rows, removed rows, and changed values.
diff = DataForge().compare('old_data.csv', 'new_data.csv')
diff.save_report('changes.xlsx')
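One way to compute this kind of diff in pandas is an outer join on a key column with `indicator=True`. The sketch below assumes each row has a unique `id` column, which the post does not specify. The frames are made up for illustration.

```python
import pandas as pd

# Illustrative data standing in for old_data.csv and new_data.csv.
old = pd.DataFrame({"id": [1, 2, 3], "price": [10, 20, 30]})
new = pd.DataFrame({"id": [2, 3, 4], "price": [20, 35, 40]})

# An outer join on the key with indicator=True labels each row's origin.
both = old.merge(new, on="id", how="outer", suffixes=("_old", "_new"), indicator=True)

removed = both[both["_merge"] == "left_only"]   # only in the old file
added = both[both["_merge"] == "right_only"]    # only in the new file

# Rows present in both files whose value differs.
in_both = both[both["_merge"] == "both"]
changed = in_both[in_both["price_old"] != in_both["price_new"]]
```

One caveat with this approach: two missing values compare as different under `!=`, so real data with gaps may need an extra check before a row is reported as changed.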
Process multiple files at once — apply the same transformation to an entire folder.
df = (DataForge()
    .batch_load('data_folder/*.csv')
    .remove_duplicates()
    .save('combined_output.xlsx'))
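A folder-wide load like this can be sketched with `glob` and `pd.concat`. The function below is a hypothetical stand-in for `batch_load`. The `source_file` column is an addition of this sketch, not something the template is known to produce.

```python
import glob
from pathlib import Path

import pandas as pd


def batch_load(pattern: str) -> pd.DataFrame:
    """Hypothetical helper: read every CSV matching the pattern into one frame."""
    paths = sorted(glob.glob(pattern))  # sorted for a stable row order
    if not paths:
        raise FileNotFoundError(f"No files match {pattern!r}")
    # Tag each row with the file it came from so it can be traced back later.
    frames = [pd.read_csv(p).assign(source_file=Path(p).name) for p in paths]
    return pd.concat(frames, ignore_index=True)
```

Because of the extra `source_file` column, identical rows from different files are no longer exact duplicates. To remove them anyway, pass the data columns explicitly, for example `combined.drop_duplicates(subset=[...])`.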
Work with multiple sheets in a single Excel file — read, write, and transform across sheets.
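The post does not show this template's API, so here is a rough pandas-only sketch of the idea: read every sheet into a dict, apply one transformation to each, and write them all back. The function names are hypothetical, and the cleanup step is just an example. The Excel calls need openpyxl for .xlsx files.

```python
import pandas as pd


def transform_sheets(sheets: dict[str, pd.DataFrame]) -> dict[str, pd.DataFrame]:
    """Hypothetical helper: apply the same cleanup to every sheet."""
    return {name: df.dropna(how="all").drop_duplicates() for name, df in sheets.items()}


def process_workbook(src: str, dst: str) -> None:
    """Read all sheets, transform them, and write them to a new workbook."""
    # sheet_name=None makes read_excel return a dict of {sheet name: DataFrame}.
    sheets = pd.read_excel(src, sheet_name=None)
    with pd.ExcelWriter(dst) as writer:
        for name, df in transform_sheets(sheets).items():
            df.to_excel(writer, sheet_name=name, index=False)
```

Keeping the transformation separate from the file I/O means it can be tested on in-memory frames without touching a workbook.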
Command-line interface for quick operations without writing Python code.
python core.py load data.csv drop_duplicates save output.xlsx
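A command line like the one above reads as a flat list of steps. The sketch below shows one way such a token stream could be split into commands and their arguments. The grammar and the `ARG_COUNTS` table are assumptions for illustration, not the actual parser in core.py.

```python
# Assumed grammar: each command name is followed by a fixed number of arguments.
ARG_COUNTS = {"load": 1, "drop_duplicates": 0, "save": 1}  # hypothetical command table


def parse_steps(tokens: list[str]) -> list[tuple[str, list[str]]]:
    """Split a flat token list into (command, arguments) steps."""
    steps = []
    i = 0
    while i < len(tokens):
        name = tokens[i]
        if name not in ARG_COUNTS:
            raise ValueError(f"Unknown command: {name}")
        n = ARG_COUNTS[name]
        args = tokens[i + 1 : i + 1 + n]
        if len(args) != n:
            raise ValueError(f"{name} expects {n} argument(s)")
        steps.append((name, args))
        i += 1 + n
    return steps


steps = parse_steps("load data.csv drop_duplicates save output.xlsx".split())
```

In a real script the tokens would come from `sys.argv[1:]`, and each parsed step would be dispatched to the matching method in the chain.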
A template showing how to create your own custom transformations and add them to the chain.
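The template's own extension mechanism is not shown in this post. As a point of reference, pandas already supports the same pattern through `DataFrame.pipe`, which lets any function that takes and returns a DataFrame join a method chain. The two step functions below are hypothetical examples.

```python
import pandas as pd


def add_total(df: pd.DataFrame, price: str, qty: str) -> pd.DataFrame:
    """Hypothetical custom step: add a Total column without mutating the input."""
    return df.assign(Total=df[price] * df[qty])


def keep_rows_above(df: pd.DataFrame, column: str, minimum: float) -> pd.DataFrame:
    """Hypothetical custom step: keep rows where column >= minimum."""
    return df[df[column] >= minimum].reset_index(drop=True)


orders = pd.DataFrame({"Price": [5.0, 2.0, 10.0], "Qty": [1, 3, 2]})

# Each .pipe call passes the current frame as the first argument of the step.
result = (
    orders
    .pipe(add_total, price="Price", qty="Qty")
    .pipe(keep_rows_above, column="Total", minimum=6)
)
```

Writing each step as a function that returns a new frame keeps the steps independent, so they can be reordered or reused across projects.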
pandas (pip install pandas)
openpyxl (pip install openpyxl)
xlrd (pip install xlrd)
Stop rewriting the same data processing code. Get 10 ready-to-use templates and start saving hours every week.
Also available on Gumroad and SellAnyCode.
Questions? Message me anytime. Happy coding!