
 

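The snippet below assumes a DataFrame df with a vector column named features already exists. A minimal setup, with the values reconstructed from the show() output further down:

from pyspark.ml.linalg import Vectors
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Three rows of 3-dimensional dense vectors (values taken from the output below)
df = spark.createDataFrame(
    [(0, Vectors.dense(1.0, 0.1, -1.0)),
     (1, Vectors.dense(2.0, 1.1, 1.0)),
     (2, Vectors.dense(3.0, 10.1, 3.0))],
    ['id', 'features'])
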
from pyspark.ml.feature import MinMaxScaler

scaler = MinMaxScaler(inputCol='features', outputCol='scaled_features')
scaler_model = scaler.fit(df)
scaled_df = scaler_model.transform(df)
scaled_df.show(truncate=False)

 

+---+--------------+---------------+
|id |features      |scaled_features|
+---+--------------+---------------+
|0  |[1.0,0.1,-1.0]|(3,[],[])      |
|1  |[2.0,1.1,1.0] |[0.5,0.1,0.5]  |
|2  |[3.0,10.1,3.0]|[1.0,1.0,1.0]  |
+---+--------------+---------------+

Please help me understand the values of the scaled_features column.

 

Here is what’s happening in your scaled_features column and why you see values like (3,[],[]) for the first row.


✅ How MinMaxScaler Works

MinMaxScaler rescales each feature column-wise to the range [0, 1]:

scaled_x = (x − min) / (max − min)

Applied per column, not per row.
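
The same per-column rescaling is easy to sketch in plain Python; this is a minimal illustration of the formula, not Spark's implementation:

# Rescale one feature column to [0, 1] using its own min and max
def min_max_scale(column):
    lo, hi = min(column), max(column)
    return [(x - lo) / (hi - lo) for x in column]

print(min_max_scale([1.0, 2.0, 3.0]))   # [0.0, 0.5, 1.0]
print(min_max_scale([0.1, 1.1, 10.1]))  # [0.0, 0.1, 1.0]
print(min_max_scale([-1.0, 1.0, 3.0]))  # [0.0, 0.5, 1.0]

(Unlike this sketch, Spark also handles a constant column, where max − min is zero: per the Spark docs it rescales such entries to the midpoint of the output range instead of dividing by zero.)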


📌 Your Input

Row 0: [1.0, 0.1, -1.0]
Row 1: [2.0, 1.1,  1.0]
Row 2: [3.0,10.1,  3.0]

Let’s compute min and max for each column:

Feature Index | Values         | Min  | Max  | Range (Max − Min)
--------------+----------------+------+------+------------------
0             | 1.0, 2.0, 3.0  | 1.0  | 3.0  | 2.0
1             | 0.1, 1.1, 10.1 | 0.1  | 10.1 | 10.0
2             | -1.0, 1.0, 3.0 | -1.0 | 3.0  | 4.0
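
You don't have to compute these statistics by hand; the fitted MinMaxScalerModel exposes what it learned during fit():

# Per-column minima and maxima learned from df (continuing the session above)
print(scaler_model.originalMin)  # [1.0,0.1,-1.0]
print(scaler_model.originalMax)  # [3.0,10.1,3.0]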

📌 Compute Scaled Values

Row 1 (id = 1) → [2.0, 1.1, 1.0]

Index 0: (2.0 − 1.0) / (3.0 − 1.0) = 1/2 = 0.5
Index 1: (1.1 − 0.1) / (10.1 − 0.1) = 1/10 = 0.1
Index 2: (1.0 − (−1.0)) / (3.0 − (−1.0)) = 2/4 = 0.5

✔ Output: [0.5, 0.1, 0.5]
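
Each division checks out in a plain Python shell:

>>> (2.0 - 1.0) / (3.0 - 1.0)
0.5
>>> (1.1 - 0.1) / (10.1 - 0.1)
0.1
>>> (1.0 - (-1.0)) / (3.0 - (-1.0))
0.5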

Row 2 (id = 2) → [3.0, 10.1, 3.0]

All are max values → scale to 1:

[1.0, 1.0, 1.0]


❗ Why Row 0 → (3,[],[]) (all zeros)

For row 0:

[1.0, 0.1, -1.0]

Index 0: (1.0 − 1.0) / 2.0 = 0
Index 1: (0.1 − 0.1) / 10.0 = 0
Index 2: (−1.0 − (−1.0)) / 4.0 = 0

👉 All scaled values are 0:

[0.0, 0.0, 0.0]

But Spark stores a vector in whichever representation is more compact, and for an all-zero vector that is the SparseVector format:

(3, [], [])

This means:

  • size = 3
  • non-zero indices: [] (none)
  • non-zero values: [] (none)

This is equivalent to:

[0.0, 0.0, 0.0]
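
You can reproduce this representation directly:

from pyspark.ml.linalg import SparseVector

sv = SparseVector(3, [], [])  # size 3, no non-zero entries
print(sv)            # (3,[],[])
print(sv.toArray())  # [0. 0. 0.]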

✅ Summary

id | features       | scaled_features | Meaning
---+----------------+-----------------+-----------------------
0  | [1.0,0.1,-1.0] | (3,[],[])       | all zeros → min values
1  | [2.0,1.1,1.0]  | [0.5,0.1,0.5]   | mid-range values
2  | [3.0,10.1,3.0] | [1.0,1.0,1.0]   | max values
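
If the mixed dense/sparse display is confusing, Spark 3.0+ provides vector_to_array to show every row as a plain array:

from pyspark.ml.functions import vector_to_array

scaled_df.select('id', vector_to_array('scaled_features').alias('scaled')) \
         .show(truncate=False)
# Row id=0 now displays as [0.0, 0.0, 0.0] instead of (3,[],[])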

 
