The Hello World of ML: Building Your First Linear Regression Model
The Core Concept
In traditional programming (like app development), you write the rules to get an answer:
Input + Rules = Output
In Machine Learning, you flip this. You give the computer the inputs together with the answers, and it figures out the rules:
Input + Output = Rules
Today, we will build a model that predicts a salary based on years of experience. We aren't going to write the math; we are going to let the machine learn the pattern.
The Mission
We have a dataset of employees with two columns:
Years of Experience (The Feature / Input)
Salary (The Label / Output)
Our goal is to train a model that can take a new number (e.g., 5.5 years) and predict the salary, even if that specific number wasn't in our original data.
The Setup
Ensure you have your environment ready (as discussed in the previous post). Create a new file named salary_predictor.py or open a new Jupyter Notebook.
The Code
Here is the complete, minimal code to train your first model.
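The listing can be sketched as follows, assuming scikit-learn's LinearRegression (the class that provides the fit, predict, coef_, and intercept_ names used in the rest of this post) and a small set of made-up salary figures. The numbers are illustrative only, not real data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative, made-up data: years of experience (feature) and salary (label).
X_train = np.array([[1.1], [1.3], [1.5], [2.0], [2.2],
                    [2.9], [3.0], [3.2], [3.7], [4.0]])
y_train = np.array([39000, 46000, 38000, 43500, 40000,
                    56500, 60000, 54500, 57000, 63000])

# Create the model and let it learn the pattern from the examples.
model = LinearRegression()
model.fit(X_train, y_train)

# Ask for a salary at an experience level that is not in the training data.
X_new = np.array([[5.0]])
predicted_salary = model.predict(X_new)

print(f"Predicted salary for 5.0 years: {predicted_salary[0]:.2f}")
```

Note that the input to predict uses the same double-bracket shape as the training data.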
Deconstructing the Magic
1. Data Preparation (NumPy)
Machine learning models expect numbers in specific shapes. X_train is our input. Notice the double brackets [[1.1], [1.3]...]. The model expects a 2D array (a table with one row per example and one column per feature), even though we only have one column of data right now. y_train, the list of salaries, can stay a plain 1D array.
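If your numbers start out as a flat list, NumPy's reshape can turn them into the expected column. A small sketch (the variable name years is just for illustration):

```python
import numpy as np

# A flat, 1D array of experience values.
years = np.array([1.1, 1.3, 1.5])

# reshape(-1, 1) means "as many rows as needed, exactly one column".
X_train = years.reshape(-1, 1)

print(years.shape)       # (3,)
print(X_train.shape)     # (3, 1)
print(X_train.tolist())  # [[1.1], [1.3], [1.5]]
```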
2. The "Fit" Method
model.fit(X_train, y_train) is where the learning happens.
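This single line of code runs the mathematical optimization that finds the line of best fit. It calculates the slope and the y-intercept that minimize the squared error between the line and the data points. With scikit-learn's LinearRegression, this is an ordinary least squares problem that is solved directly, not an iterative Gradient Descent loop.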
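For a single feature, the least-squares line of best fit has a short closed-form solution, so you can reproduce the idea by hand. The sketch below uses only NumPy and made-up data; it illustrates the math and does not reproduce scikit-learn's internal code. np.polyfit is used as an independent cross-check.

```python
import numpy as np

# Made-up data: years of experience and salaries (not perfectly on a line).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([41000.0, 49000.0, 62000.0, 68000.0, 81000.0])

# Closed-form least squares for one feature:
#   slope     = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
#   intercept = mean_y - slope * mean_x
x_centered = x - x.mean()
y_centered = y - y.mean()
slope = np.sum(x_centered * y_centered) / np.sum(x_centered ** 2)
intercept = y.mean() - slope * x.mean()

# The quantity being minimized: the sum of squared errors.
errors = y - (slope * x + intercept)
sum_squared_error = np.sum(errors ** 2)

# Cross-check against NumPy's own degree-1 polynomial fit.
check_slope, check_intercept = np.polyfit(x, y, 1)

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"sum of squared errors={sum_squared_error:.2f}")
```

Any other slope or intercept would give a larger sum of squared errors on this data.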
3. The Prediction
Once fit executes, the model is trained. It now holds the "logic" in its memory. When we call .predict(), it applies that learned logic to our new input (5.0 years).
Technical Empowerment: What actually happened?
The model didn't memorize the salaries. It calculated a formula.
In high school math, you learned the equation for a straight line:
y = mx + c
y: Salary
x: Years of Experience
m: Coefficient (How much salary increases per 1 year)
c: Intercept (The predicted starting salary for 0 years of experience)
By printing model.coef_ and model.intercept_, you can see the exact formula the machine "learned" from your data.
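To see that predict is nothing more than this formula, you can plug the learned numbers in yourself. The sketch below assumes scikit-learn's LinearRegression and uses made-up data that lies exactly on the line salary = 10000 * years + 30000, so the learned m and c should come out very close to those values.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data lying exactly on the line: salary = 10000 * years + 30000.
X_train = np.array([[1.0], [2.0], [3.0], [4.0]])
y_train = np.array([40000.0, 50000.0, 60000.0, 70000.0])

model = LinearRegression()
model.fit(X_train, y_train)

m = model.coef_[0]     # salary increase per extra year
c = model.intercept_   # predicted salary at 0 years

# Apply the formula by hand and compare with the model's own prediction.
years = 5.0
by_hand = m * years + c
by_model = model.predict(np.array([[years]]))[0]

print(f"m = {m:.2f}, c = {c:.2f}")
print(f"by hand: {by_hand:.2f}, by model: {by_model:.2f}")
```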
Congratulations. You just taught a rock how to think.