{ "cells": [ { "cell_type": "markdown", "id": "f2e36467", "metadata": {}, "source": [ "# NumPy 101" ] }, { "cell_type": "code", "execution_count": null, "id": "04041152", "metadata": {}, "outputs": [], "source": [ "# Assoc. Prof. Dr. Piyabute Fuangkhon\n", "# Department of Digital Business Management\n", "# Martin de Tours School of Management and Economics\n", "# Assumption University\n", "# Update: 22/05/2024" ] }, { "cell_type": "markdown", "id": "ba65b924", "metadata": {}, "source": [ "## Introduction to NumPy for Data Analytics\n", "\n", "NumPy is a fundamental package for scientific computing in Python. It provides support for arrays, matrices, and many mathematical functions. Let's look at the basics of NumPy and how it can be used for data analytics. This tutorial will guide you through various operations, from basic array creation to complex linear algebra and random number generation." ] }, { "cell_type": "markdown", "id": "95ab23ab", "metadata": {}, "source": [ "## Importing the NumPy Library\n", "\n", "We need to import the NumPy library. It's a common practice to import it with the alias 'np'." ] }, { "cell_type": "code", "execution_count": null, "id": "14a8c26e", "metadata": {}, "outputs": [], "source": [ "# Import the NumPy library with the alias 'np'.\n", "import numpy as np\n", "print(np.__version__)" ] }, { "cell_type": "markdown", "id": "8513fc62", "metadata": {}, "source": [ "## Array Creation\n", "\n", "NumPy arrays are similar to lists in Python, but they allow for more efficient operations. We'll start with creating simple arrays and then explore different ways to initialize arrays.\n", "\n", "Let's create arrays that could represent sales data, inventory levels, or any other sequential data that you might encounter in business." 
] }, { "cell_type": "code", "execution_count": null, "id": "0885e81b", "metadata": {}, "outputs": [], "source": [ "# Step 1: Create NumPy array using np.array\n", "array_from_list = np.array([1, 2, 3, 4, 5]) # A 1D array from a list\n", "print(\"1D array from list =\", array_from_list)\n", "\n", "# Step 2: Create NumPy array using np.arange\n", "array_arange = np.arange(10) # A 1D array with values from 0 to 9\n", "print(\"1D array using arange =\", array_arange)\n", "\n", "# Step 3: Create NumPy array using np.linspace\n", "array_linspace = np.linspace(0, 1, 5) # A 1D array with 5 values evenly spaced between 0 and 1\n", "print(\"1D array using linspace =\", array_linspace)\n", "print()\n", "\n", "# Step 4: Create NumPy 2D array using np.array\n", "array_from_list_2d = np.array([[1, 2, 3], [4, 5, 6]]) # Use np.array to create a two-dimensional array with elements [[1, 2, 3], [4, 5, 6]]\n", "print(\"2D array from list =\\n\", array_from_list_2d)\n", "\n", "# Step 5: Create NumPy array using np.zeros\n", "array_of_zeros = np.zeros((3, 4)) # A 2D array of zeros with shape (3, 4)\n", "print(\"2D array of zeros =\\n\", array_of_zeros)\n", "\n", "# Step 6: Create NumPy array using np.ones\n", "array_of_ones = np.ones((2, 3)) # A 2D array of ones with shape (2, 3)\n", "print(\"2D array of ones =\\n\", array_of_ones)\n", "\n", "# Step 7: Create NumPy array using np.arange and np.reshape\n", "array_of_sequence = np.arange(1, 2 * 3 + 1).reshape(2, 3) # A 2D array of sequence numbers with shape (2, 3)\n", "print(\"2D array of sequence numbers =\\n\", array_of_sequence)" ] }, { "cell_type": "markdown", "id": "9a66c4e6", "metadata": {}, "source": [ "## Array Attributes\n", "\n", "Understanding array attributes is crucial for manipulating data effectively. We'll explore the shape, size, dimensions, and data type of arrays.\n", "\n", "Consider an array representing monthly sales data for a product. 
Understanding its shape and size helps in reshaping and performing aggregate functions." ] }, { "cell_type": "code", "execution_count": null, "id": "6b2a6bd2", "metadata": {}, "outputs": [], "source": [ "# Step 1: Create arrays\n", "array_from_list = np.array([1, 2, 3, 4, 5])\n", "array_of_zeros = np.zeros((3, 4))\n", "array_of_ones = np.ones((2, 3))\n", "\n", "# Step 2: Print arrays\n", "print(\"Variables\")\n", "print(\"============================================================================\")\n", "print(\"Array_from_list =>\\n\", array_from_list)\n", "print(\"Array_of_zeros =>\\n\", array_of_zeros)\n", "print(\"Array_of_ones =>\\n\", array_of_ones)\n", "\n", "# Step 3: Shape of arrays\n", "print(\"\\nShape of arrays (array.shape)\")\n", "print(\"============================================================================\")\n", "print(\"Array_from_list =\", array_from_list.shape) # The shape attribute returns the dimensions of the array\n", "print(\"Array_of_zeros =\", array_of_zeros.shape) # The shape attribute returns the dimensions of the array\n", "print(\"Array_of_ones =\", array_of_ones.shape) # The shape attribute returns the dimensions of the array\n", "\n", "# Step 4: Size of arrays\n", "print(\"\\nSize of arrays (array.size)\")\n", "print(\"============================================================================\")\n", "print(\"Array_from_list =\", array_from_list.size) # The size attribute returns the total number of elements in the array\n", "print(\"Array_of_zeros =\", array_of_zeros.size) # The size attribute returns the total number of elements in the array\n", "print(\"Array_of_ones =\", array_of_ones.size) # The size attribute returns the total number of elements in the array\n", "\n", "# Step 5: Number of dimensions of arrays\n", "print(\"\\nNumber of dimensions of arrays (array.ndim)\")\n", "print(\"============================================================================\")\n", "print(\"Array_from_list =\", array_from_list.ndim) # 
The ndim attribute returns the number of dimensions of the array\n", "print(\"Array_of_zeros =\", array_of_zeros.ndim) # The ndim attribute returns the number of dimensions of the array\n", "print(\"Array_of_ones =\", array_of_ones.ndim) # The ndim attribute returns the number of dimensions of the array\n", "\n", "# Step 6: Data type of arrays\n", "print(\"\\nData type of arrays (array.dtype)\")\n", "print(\"============================================================================\")\n", "print(\"Array_from_list =\", array_from_list.dtype) # The dtype attribute returns the data type of the elements in the array\n", "print(\"Array_of_zeros =\", array_of_zeros.dtype) # The dtype attribute returns the data type of the elements in the array\n", "print(\"Array_of_ones =\", array_of_ones.dtype) # The dtype attribute returns the data type of the elements in the array" ] }, { "cell_type": "markdown", "id": "7be15687", "metadata": {}, "source": [ "## Array Indexing and Slicing\n", "\n", "Techniques used to access, modify, and extract specific elements, subarrays, or ranges within an array based on their positions. This is particularly useful when working with subsets of data, such as specific months in sales data or particular products in inventory data." 
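, "\n", "\n", "As a minimal sketch of that use case, the cell below slices one quarter out of a year of monthly sales and picks out the strong months with a boolean mask. The `monthly_sales` figures and the threshold of 150 are hypothetical values chosen only for illustration." ] }, { "cell_type": "code", "execution_count": null, "id": "a1f3c9d0", "metadata": {}, "outputs": [], "source": [ "# Illustrative sketch: the sales figures below are made up, not taken from a real dataset\n", "import numpy as np\n", "\n", "monthly_sales = np.array([120, 135, 150, 160, 155, 170, 180, 175, 165, 190, 210, 230])\n", "\n", "q2_sales = monthly_sales[3:6] # Months 4 to 6 (indices 3, 4, 5; the end index is exclusive)\n", "print(\"Q2 sales =\", q2_sales)\n", "\n", "strong_mask = monthly_sales > 150 # Boolean mask: True where sales exceed the threshold\n", "strong_months = np.where(strong_mask)[0] + 1 # Convert 0-based indices to 1-based month numbers\n", "print(\"Months with sales above 150 =\", strong_months)\n", "print(\"Sales in those months =\", monthly_sales[strong_mask])"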
] }, { "cell_type": "code", "execution_count": null, "id": "a2b9e400", "metadata": {}, "outputs": [], "source": [ "# Assuming arrays are already created in previous steps\n", "array_from_list = np.array([1, 2, 3, 4, 5])\n", "array_of_sequence = np.arange(1, 2 * 3 + 1).reshape(2, 3)\n", "\n", "# Step 1: Accessing elements in 1D array and 2D array\n", "print(\"1D - array (array_from_list) =>\\n\", array_from_list)\n", "print(\"2D - array (array_of_sequence) =>\\n\", array_of_sequence)\n", "\n", "print(\"\\nAccessing elements\")\n", "print(\"============================================================================\")\n", "print(\"1D - Element at index 2 in array_from_list =\", array_from_list[2]) # Accessing the third element (index 2) in the 1D array\n", "print(\"2D - Element at row 1, column 2 in array_of_sequence =\", array_of_sequence[1, 2]) # Accessing the element at row 1, column 2 in the 2D array\n", "\n", "# Step 2: Slicing 1D array and 2D array\n", "print(\"\\nSlicing\")\n", "print(\"============================================================================\")\n", "print(\"1D - Elements from index 1 to 3 in array_from_list =\\n\", array_from_list[1:4]) # Slicing elements from index 1 to 3 in the 1D array (note that the end index is exclusive)\n", "print(\"2D - First two rows, first two columns of array_of_sequence =\\n\", array_of_sequence[:2, :2]) # Slicing the first two rows and first two columns in the 2D array\n", "\n", "# Step 3: Boolean indexing 1D array and 2D array\n", "print(\"\\nBoolean indexing\")\n", "print(\"============================================================================\")\n", "bool_idx = array_from_list > 3 # Creating a boolean index array for elements greater than 3 in the 1D array\n", "print(\"1D - Elements greater than 3 in array_from_list =\", array_from_list[bool_idx])\n", "bool_idx_2d = array_of_sequence > 3 # Creating a boolean index array for elements greater than 3 in the 2D array\n", "print(\"2D - Elements greater 
than 3 in array_of_sequence =\", array_of_sequence[bool_idx_2d])\n", "\n", "# Step 4: Fancy indexing 1D array and 2D array\n", "print(\"\\nFancy indexing\")\n", "print(\"============================================================================\")\n", "fancy_idx = [0, 2, 4] # Fancy indexing for specific indices in the 1D array\n", "print(\"1D - Elements at indices 0, 2, and 4 in array_from_list =\\n\", array_from_list[fancy_idx])\n", "fancy_idx_rows = [0, 1] # Fancy indexing for specific row indices in the 2D array\n", "fancy_idx_cols = [1, 2] # Fancy indexing for specific column indices in the 2D array\n", "print(\"2D - Elements at row indices [0, 1] and column indices [1, 2] in array_of_sequence =\\n\", array_of_sequence[fancy_idx_rows, :][:, fancy_idx_cols])\n" ] }, { "cell_type": "markdown", "id": "6bfbe7f6", "metadata": {}, "source": [ "## Array Operations\n", "\n", "Actions performed on arrays, such as arithmetic calculations, element-wise operations, and transformations, to manipulate and analyze data stored in arrays. These operations can help calculate profit margins, growth rates, and other business metrics." 
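, "\n", "\n", "As a minimal sketch of one such metric, the cell below computes a profit margin per product with element-wise arithmetic. The `revenue` and `cost` figures are hypothetical values chosen only for illustration." ] }, { "cell_type": "code", "execution_count": null, "id": "b2e4d8c1", "metadata": {}, "outputs": [], "source": [ "# Illustrative sketch: revenue and cost are made-up figures for four products\n", "import numpy as np\n", "\n", "revenue = np.array([1000.0, 1500.0, 800.0, 1200.0])\n", "cost = np.array([700.0, 900.0, 600.0, 1000.0])\n", "\n", "profit = revenue - cost # Element-wise subtraction\n", "margin_pct = profit / revenue * 100 # Element-wise division, then multiplication by a scalar\n", "\n", "print(\"Profit =\", profit)\n", "print(\"Profit margin (%) =\", np.round(margin_pct, 1))"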
] }, { "cell_type": "code", "execution_count": null, "id": "e93ca23f", "metadata": {}, "outputs": [], "source": [ "# Step 1: Define two 1D arrays for element-wise operations\n", "array1 = np.array([1, 2, 3, 4, 5])\n", "array2 = np.array([10, 20, 30, 40, 50])\n", "\n", "# Print the original arrays\n", "print(\"array1 =>\", array1)\n", "print(\"array2 =>\", array2)\n", "print()\n", "\n", "# Step 2: Perform and print element-wise addition\n", "print(\"array1 + array2 =\", array1 + array2) # Perform element-wise addition\n", "\n", "# Step 3: Perform and print element-wise subtraction\n", "print(\"array1 - array2 =\", array1 - array2) # Perform element-wise subtraction\n", "\n", "# Step 4: Perform and print element-wise multiplication\n", "print(\"array1 * array2 =\", array1 * array2) # Perform element-wise multiplication\n", "\n", "# Step 5: Perform and print element-wise division\n", "print(\"array1 / array2 =\", array1 / array2) # Perform element-wise division" ] }, { "cell_type": "code", "execution_count": null, "id": "c1fd7f4c", "metadata": {}, "outputs": [], "source": [ "# Step 1: Define two arrays for broadcasting operations\n", "array3 = np.array([[1, 2, 3], [4, 5, 6]])\n", "array4 = np.array([10, 20, 30])\n", "\n", "# Print the original arrays\n", "print(\"array3 =>\\n\", array3)\n", "print(\"array4 =>\\n\", array4)\n", "print()\n", "\n", "# Step 2: Perform and print broadcasting operations\n", "print(\"array3 + array4 =\\n\", array3 + array4) # Calculate and print broadcasting addition\n", "print(\"array3 - array4 =\\n\", array3 - array4) # Calculate and print broadcasting subtraction\n", "print(\"array3 * array4 =\\n\", array3 * array4) # Calculate and print broadcasting multiplication\n", "print(\"array3 / array4 =\\n\", array3 / array4) # Calculate and print broadcasting division" ] }, { "cell_type": "markdown", "id": "d4e08c0c", "metadata": {}, "source": [ "## Aggregate Functions\n", "\n", "Operations that process multiple elements of an array to return a 
single value, such as sum, mean, minimum, maximum, and standard deviation." ] }, { "cell_type": "code", "execution_count": null, "id": "ea2a3c06", "metadata": {}, "outputs": [], "source": [ "# Step 1: Define a 1D array for aggregate functions\n", "array1 = np.array([1, 2, 3, 4, 5])\n", "\n", "# Print the original array\n", "print(\"array1 =>\", array1)\n", "print()\n", "\n", "# Step 2: Perform and print aggregate functions\n", "print(\"Sum =\", np.sum(array1)) # Calculate and print the sum of the array elements\n", "print(\"Mean =\", np.mean(array1)) # Calculate and print the mean (average) of the array elements\n", "print(\"Standard Deviation =\", np.std(array1)) # Calculate and print the standard deviation of the array elements\n", "print(\"Minimum =\", np.min(array1)) # Calculate and print the minimum value in the array\n", "print(\"Maximum =\", np.max(array1)) # Calculate and print the maximum value in the array" ] }, { "cell_type": "markdown", "id": "0e9964c5", "metadata": {}, "source": [ "## Mathematical Functions\n", "\n", "Functions that perform various mathematical computations on array elements, including operations like sine, cosine, exponential, and logarithm." 
] }, { "cell_type": "code", "execution_count": null, "id": "7fa8e6c4", "metadata": {}, "outputs": [], "source": [ "# Step 1: Define arrays for angles (in radians) and values\n", "angles = np.array([0, np.pi/2, np.pi, 3*np.pi/2, 2*np.pi])\n", "values = np.array([1, 2, 3, 4, 5])\n", "\n", "# Print the original arrays\n", "print(\"angles =>\", angles)\n", "print(\"values =>\", values)\n", "print()\n", "\n", "# Step 2: Perform and print mathematical functions\n", "print(\"sin(angles) =\", np.sin(angles)) # Calculate and print the sine of the angles\n", "print(\"cos(angles) =\", np.cos(angles)) # Calculate and print the cosine of the angles\n", "print(\"exp(values) =\", np.exp(values)) # Calculate and print the exponential of the values\n", "print(\"log(values) =\", np.log(values)) # Calculate and print the natural logarithm of the values" ] }, { "cell_type": "markdown", "id": "2fcffece", "metadata": {}, "source": [ "## Reshaping and Resizing Arrays\n", "\n", "Processes of changing the dimensions or structure of an array without altering its data, including operations like reshaping into different shapes and resizing to adjust the number of elements." 
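, "\n", "\n", "As a minimal sketch of why reshaping matters in practice, the cell below turns twelve monthly sales figures into a 4x3 array (quarters by months) so that each quarter can be summed along an axis. The `monthly_sales` figures are hypothetical values chosen only for illustration." ] }, { "cell_type": "code", "execution_count": null, "id": "c3d5e7f2", "metadata": {}, "outputs": [], "source": [ "# Illustrative sketch: the sales figures below are made up, not taken from a real dataset\n", "import numpy as np\n", "\n", "monthly_sales = np.array([120, 135, 150, 160, 155, 170, 180, 175, 165, 190, 210, 230])\n", "\n", "sales_by_quarter = monthly_sales.reshape(4, 3) # 4 quarters (rows) x 3 months (columns)\n", "print(\"Sales by quarter and month =\\n\", sales_by_quarter)\n", "\n", "quarterly_totals = sales_by_quarter.sum(axis=1) # Sum the three months in each row\n", "print(\"Quarterly totals =\", quarterly_totals)"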
] }, { "cell_type": "code", "execution_count": null, "id": "7b4bc1c8", "metadata": {}, "outputs": [], "source": [ "# Step 1: Reshape\n", "array_original = np.arange(1, 13) # Creating a 1D array with the values 1 to 12\n", "print(\"Original array (1D) =>\", array_original)\n", "\n", "array_reshaped = array_original.reshape(3, 4) # Reshaping the original 1D array to a 3x4 2D array\n", "print(\"\\nReshaped array (3x4) =\\n\", array_reshaped)\n", "\n", "# Step 2: Ravel\n", "array_raveled = array_reshaped.ravel() # Flattening the 2D array back to a 1D array using ravel (returns a view when possible)\n", "print(\"\\nRaveled array (1D) =\\n\", array_raveled)\n", "\n", "# Step 3: Flatten\n", "array_flattened = array_reshaped.flatten() # Flattening the 2D array back to a 1D array using flatten (always returns a copy)\n", "print(\"\\nFlattened array (1D) =\\n\", array_flattened)\n", "\n", "# Step 4: Transpose\n", "array_transposed = array_reshaped.transpose() # Transposing the 3x4 2D array to a 4x3 2D array\n", "print(\"\\nTransposed array (3x4 to 4x3) =\\n\", array_transposed)\n" ] }, { "cell_type": "markdown", "id": "2b661e80", "metadata": {}, "source": [ "## Stacking and Splitting Arrays\n", "\n", "Operations that combine multiple arrays into a single array along a specified axis (stacking) or divide an array into multiple sub-arrays along a specified axis (splitting)." 
] }, { "cell_type": "code", "execution_count": null, "id": "ff677c05", "metadata": {}, "outputs": [], "source": [ "# Step 1: Horizontal stacking (hstack)\n", "array1 = np.array([1, 2, 3])\n", "array2 = np.array([4, 5, 6])\n", "array_hstack = np.hstack((array1, array2)) # Horizontal stacking array1 and array2\n", "print(\"array1 =>\", array1)\n", "print(\"array2 =>\", array2)\n", "print()\n", "print(\"Horizontal stacking array1 and array2 =\", array_hstack)\n", "\n", "# Step 2: Vertical stacking (vstack)\n", "array3 = np.array([[1, 2, 3], [4, 5, 6]])\n", "array4 = np.array([[7, 8, 9], [10, 11, 12]])\n", "print()\n", "print(\"array3 =>\\n\", array3)\n", "print(\"array4 =>\\n\", array4)\n", "array_vstack = np.vstack((array3, array4)) # Vertical stacking array3 and array4\n", "print()\n", "print(\"Vertical stacking array3 and array4 =\\n\", array_vstack)\n", "\n", "# Step 3: Depth stacking (dstack)\n", "array5 = np.array([[1, 2, 3], [4, 5, 6]])\n", "array6 = np.array([[7, 8, 9], [10, 11, 12]])\n", "array_dstack = np.dstack((array5, array6)) # Depth stacking array5 and array6\n", "print()\n", "print(\"array5 =>\\n\", array5)\n", "print(\"array6 =>\\n\", array6)\n", "print()\n", "print(\"Depth stacking array5 and array6 =\\n\", array_dstack)\n", "\n", "# Step 4: Splitting arrays\n", "array7 = np.arange(1, 13).reshape(3, 4)\n", "array_hsplit = np.hsplit(array7, 2) # Horizontal split (hsplit)\n", "print()\n", "print(\"array7 =>\\n\", array7)\n", "print()\n", "print(\"Horizontal splitting array7 =\")\n", "for i, arr in enumerate(array_hsplit):\n", " print(f\"Part {i}:\\n{arr}\")\n", "\n", "# Step 5: Vertical split (vsplit)\n", "array_vsplit = np.vsplit(array7, 3) # Vertical split (vsplit)\n", "print(\"\\nVertical splitting array7 =\")\n", "for i, arr in enumerate(array_vsplit):\n", " print(f\"Part {i}:\\n{arr}\")\n", "\n", "# Step 6: Depth split (dsplit)\n", "array8 = np.dstack((array5, array6, array5))\n", "print()\n", "print(\"array8 =>\\n\", array8)\n", "print()\n", 
"array_dsplit = np.dsplit(array8, 3) # Depth split (dsplit)\n", "print(\"\\nDepth splitting array8 =\")\n", "for i, arr in enumerate(array_dsplit):\n", " print(f\"Part {i}:\\n{arr}\")" ] }, { "cell_type": "markdown", "id": "538e226b", "metadata": {}, "source": [ "## Linear Algebra\n", "\n", "Computational procedures involving matrices and vectors, such as matrix multiplication, calculating determinants, finding eigenvalues and eigenvectors, and solving systems of linear equations. These techniques are useful in optimization problems, financial modeling, and various analytical tasks." ] }, { "cell_type": "code", "execution_count": null, "id": "98e83be6", "metadata": {}, "outputs": [], "source": [ "# Step 1: Dot product\n", "vector1 = np.array([1, 2, 3])\n", "vector2 = np.array([4, 5, 6])\n", "print(\"vector1 =>\", vector1)\n", "print(\"vector2 =>\", vector2)\n", "dot_product = np.dot(vector1, vector2) # Calculate the dot product of vector1 and vector2\n", "print(\"\\nDot product of vector1 and vector2 =\", dot_product)\n", "\n", "# Step 2: Matrix multiplication\n", "matrix1 = np.array([[1, 2], [3, 4]])\n", "matrix2 = np.array([[5, 6], [7, 8]])\n", "print()\n", "print(\"matrix1 =>\\n\", matrix1)\n", "print(\"matrix2 =>\\n\", matrix2)\n", "matrix_multiplication = np.matmul(matrix1, matrix2) # Calculate the matrix multiplication of matrix1 and matrix2\n", "print(\"\\nMatrix multiplication of matrix1 and matrix2 =\\n\", matrix_multiplication)\n", "\n", "# Step 3: Determinant\n", "matrix3 = np.array([[1, 2], [3, 4]])\n", "print()\n", "print(\"matrix3 =>\\n\", matrix3)\n", "determinant = np.linalg.det(matrix3) # Calculate the determinant of matrix3\n", "print(\"\\nDeterminant of matrix3 =\", determinant)\n", "\n", "# Step 4: Eigenvalues and eigenvectors\n", "matrix4 = np.array([[1, 2], [2, 1]])\n", "print()\n", "print(\"matrix4 =>\\n\", matrix4)\n", "eigenvalues, eigenvectors = np.linalg.eig(matrix4) # Calculate the eigenvalues and eigenvectors of matrix4\n", 
"print(\"\\nEigenvalues of matrix4 =\", eigenvalues)\n", "print(\"Eigenvectors of matrix4 =\\n\", eigenvectors)\n", "\n", "# Step 5: Solving linear equations\n", "A = np.array([[2, 1], [1, 3]])\n", "b = np.array([1, 2])\n", "print()\n", "print(\"A =>\\n\", A)\n", "print(\"b =>\\n\", b)\n", "solution = np.linalg.solve(A, b) # Solve the system of linear equations Ax = b\n", "print(\"\\nSolution of the system of linear equations Ax = b =\", solution)" ] }, { "cell_type": "markdown", "id": "77f10faf", "metadata": {}, "source": [ "## Random Number Generation\n", "\n", "The process of using Python's NumPy library to create sequences of random numbers for various applications like simulations and data analysis." ] }, { "cell_type": "code", "execution_count": null, "id": "8af15566", "metadata": {}, "outputs": [], "source": [ "# Step 1: Generating random numbers\n", "random_numbers = np.random.rand(3) # Generate a 1D array of 3 random numbers between 0 and 1\n", "print(\"Generating random numbers (0 to 1) =\", random_numbers)\n", "\n", "# Step 2: Setting random seed\n", "np.random.seed(100) # Set the seed for reproducibility\n", "random_numbers_seeded = np.random.rand(3) # Generate a 1D array of 3 random numbers with seed 100\n", "print(\"Generating random numbers with seed 100 =\", random_numbers_seeded)\n", "\n", "# Step 3: Random sampling\n", "random_integers = np.random.randint(10, 50, 3) # Generate a 1D array of 3 random integers between 10 and 50\n", "print(\"Random sampling (integers between 10 and 50) =\", random_integers)\n", "\n", "# Step 4: Random distributions\n", "random_normal = np.random.randn(3) # Generate a 1D array of 3 random numbers from a normal distribution (mean=0, std=1)\n", "print(\"Random numbers from a normal distribution (mean=0, std=1) =\", random_normal)\n", "\n", "# Step 5: Uniform distribution\n", "random_uniform = np.random.uniform(0, 10, 3) # Generate a 1D array of 3 random numbers from a uniform distribution between 0 and 10\n", 
"print(\"Random numbers from a uniform distribution (between 0 and 10) =\", random_uniform)" ] }, { "cell_type": "markdown", "id": "1daed295", "metadata": {}, "source": [ "## File Input and Output Operations\n", "\n", "Actions that facilitate reading data from external sources (input) and writing data to external destinations (output), enabling interaction with files." ] }, { "cell_type": "code", "execution_count": null, "id": "9b871fdf", "metadata": {}, "outputs": [], "source": [ "# Step 1: Saving and loading 1D arrays\n", "array_to_save = np.array([1, 2, 3, 4, 5]) # Create a sample array\n", "print(\"array_to_save =>\", array_to_save)\n", "\n", "np.savetxt('array.txt', array_to_save) # Save the array to a text file\n", "print(\"\\nArray saved to 'array.txt'\")\n", "\n", "loaded_array_txt = np.loadtxt('array.txt') # Load the array from the text file\n", "print(\"\\nArray loaded from 'array.txt' =\", loaded_array_txt)\n", "\n", "np.save('array.npy', array_to_save) # Save the array to a binary file\n", "print(\"\\nArray saved to 'array.npy'\")\n", "\n", "loaded_array_npy = np.load('array.npy') # Load the array from the binary file\n", "print(\"\\nArray loaded from 'array.npy' =\", loaded_array_npy)\n", "\n", "# Step 2: Working with 2D arrays and text files\n", "array_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) # Create a 2D array\n", "\n", "np.savetxt('array_2d.txt', array_2d, delimiter=',') # Save the 2D array to a text file with a custom delimiter\n", "print(\"\\n2D array saved to 'array_2d.txt' with comma delimiter\")\n", "\n", "loaded_array_2d_txt = np.loadtxt('array_2d.txt', delimiter=',') # Load the 2D array from the text file\n", "print(\"\\n2D array loaded from 'array_2d.txt' =\\n\", loaded_array_2d_txt)\n", "\n", "# Step 3: Working with 2D arrays and binary files\n", "np.save('array_2d.npy', array_2d) # Save the 2D array to a binary file\n", "print(\"\\n2D array saved to 'array_2d.npy'\")\n", "\n", "loaded_array_2d_npy = np.load('array_2d.npy') # Load 
the 2D array from the binary file\n", "print(\"\\n2D array loaded from 'array_2d.npy' =\\n\", loaded_array_2d_npy)" ] }, { "cell_type": "markdown", "id": "08c8e9b5", "metadata": {}, "source": [ "# Practice\n", "\n", "The code block below reads sales data from a URL and stores the data in an array. The first row in the dataset defines the attribute names. Your task is to find and display useful information (aggregated data) from this data using the NumPy library." ] }, { "cell_type": "code", "execution_count": null, "id": "49b12005", "metadata": {}, "outputs": [], "source": [ "# Step 1: Import the 'urllib.request' module and NumPy\n", "import urllib.request\n", "import numpy as np\n", "\n", "# Step 2: Specify the URL of the file to be opened\n", "url = \"https://piyabute.s3.ap-southeast-1.amazonaws.com/notebook/sales_data_1000.csv\"\n", "\n", "# Step 3: Open the URL and read the content\n", "data = []\n", "\n", "with urllib.request.urlopen(url) as response:\n", "    lines = response.read().decode('utf-8').split('\\n')\n", "\n", "    # Step 4: Split the header line\n", "    headers = lines[0].strip().split(',')\n", "\n", "    # Step 5: Split each subsequent line and collect data\n", "    for line in lines[1:]:\n", "        if line.strip(): # Skip any empty lines\n", "            row = line.strip().split(',')\n", "            data.append(row)\n", "\n", "# Step 6: Convert data to a NumPy array (all values are strings at this point)\n", "data = np.array(data)\n", "\n", "# Step 7: Print the headers and the first 5 data rows to verify the data\n", "print(\"First 5 rows of data:\")\n", "print(headers)\n", "print(data[:5])" ] }, { "cell_type": "code", "execution_count": null, "id": "3c946069", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", 
"version": "3.12.3" } }, "nbformat": 4, "nbformat_minor": 5 }