intuitive-deep-learning

Lab-000: Scalars & Vectors

Implement and compare the performance of vector addition and dot product using standard Python lists and NumPy arrays, measuring the execution time for both implementations on large vectors to observe speed differences.

Notebook

Notes

In summary, use Python lists for general-purpose programming where flexibility, mixed data types, and frequent modifications are needed, and use NumPy arrays for performance-critical numerical computations involving large datasets.

The performance difference between Python lists and NumPy arrays is significant, especially for numerical operations on large datasets. Python lists are general-purpose dynamic arrays that can hold elements of different data types. Each element is a reference to a separate Python object, so every operation pays for following that reference and checking the object's type at run time. In contrast, NumPy arrays are designed for numerical computations and store elements of a single data type contiguously in memory. This contiguous storage, combined with NumPy’s underlying C implementation, allows for highly optimized, vectorized operations.
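The storage difference described above can be inspected directly. The following is a minimal sketch rather than code from the lab notebook: the size `n` and all variable names are illustrative assumptions, and the byte counts reported by `sys.getsizeof` are specific to CPython.

```python
import sys

import numpy as np

n = 1_000  # illustrative size (assumption, not taken from the lab)

py_list = [float(i) for i in range(n)]
np_array = np.arange(n, dtype=np.float64)

# A list holds references to separate Python float objects.
# getsizeof(list) counts only the container and its references,
# so the float objects are measured separately.
list_container_bytes = sys.getsizeof(py_list)
list_element_bytes = sum(sys.getsizeof(x) for x in py_list)

# A NumPy array holds the raw 8-byte values in a single buffer.
print("list container bytes:", list_container_bytes)
print("list element bytes:  ", list_element_bytes)
print("array data bytes:    ", np_array.nbytes)
print("array dtype:", np_array.dtype, "| itemsize:", np_array.itemsize)
print("C-contiguous:", np_array.flags["C_CONTIGUOUS"])
```

The array reports one dtype and one item size for all elements, which is what lets compiled code walk the buffer without inspecting each value.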

In the tests we ran, NumPy arrays consistently outperformed Python lists by orders of magnitude for both vector addition and dot product as the vector size increased. For example, at a vector size of 100,000 the NumPy operations were dramatically faster. Two factors explain this: NumPy operations are implemented in C, and they are vectorized, meaning a single call processes the whole array in a compiled loop instead of stepping through the elements in the Python interpreter, which avoids Python’s per-iteration overhead. This makes NumPy the preferred choice for scientific computing and data analysis in Python when performance is critical.

Comparison of Python List and NumPy Array

| Feature | Python List | NumPy Array |
| --- | --- | --- |
| Data Types | Heterogeneous (mixed types) | Homogeneous (single type) |
| Memory Storage | Non-contiguous (pointers to objects) | Contiguous (packed data values) |
| Operations | Slower, uses Python loops | Faster, uses optimized C functions (vectorization) |
| Memory Efficiency | Less efficient (high overhead per element) | Highly efficient (compact storage) |
| Flexibility | Dynamic sizing, easier append/insert | Fixed size upon creation |
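The speed comparison can be reproduced with a small timing sketch. This is one possible setup under stated assumptions, not the notebook's actual code: the helpers `list_add`, `list_dot`, and `time_it` are hypothetical names, the vector size matches the 100,000 mentioned in the notes, and the absolute timings will vary by machine.

```python
import timeit

import numpy as np

n = 100_000  # vector size mentioned in the notes

a_list = [float(i) for i in range(n)]
b_list = [float(2 * i) for i in range(n)]
a_arr = np.array(a_list)
b_arr = np.array(b_list)


def list_add(a, b):
    # Element-wise addition with a Python-level loop.
    return [x + y for x, y in zip(a, b)]


def list_dot(a, b):
    # Dot product with a Python-level loop.
    return sum(x * y for x, y in zip(a, b))


def time_it(fn, repeats=5):
    # Best of several single runs, in seconds.
    return min(timeit.repeat(fn, number=1, repeat=repeats))


timings = {
    "list add": time_it(lambda: list_add(a_list, b_list)),
    "numpy add": time_it(lambda: a_arr + b_arr),
    "list dot": time_it(lambda: list_dot(a_list, b_list)),
    "numpy dot": time_it(lambda: np.dot(a_arr, b_arr)),
}

for name, seconds in timings.items():
    print(f"{name:>9}: {seconds * 1e3:.3f} ms")
```

Taking the minimum over several runs reduces noise from other processes. Rerunning with larger or smaller `n` shows how the gap changes with vector size.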

Key Reasons for Speed Difference: