Question

The program below computes the product of two matrices. Add a new kernel, called sum, that computes the sum of the two matrices.

#include <stdio.h>
#include <stdlib.h>   // rand, srand
#include <time.h>     // time

#define TILE_WIDTH 2
#define WIDTH 6

// Kernel function executed by the device (GPU)
__global__ void
product (float *d_a, float *d_b, float *d_c, const int n) {
   int col = blockIdx.x * blockDim.x + threadIdx.x ;
   int row = blockIdx.y * blockDim.y + threadIdx.y ;

   float sum = 0;
   if (row < n && col < n) {
      for (int i = 0 ; i<n ; ++i) {
         sum += d_a[row * n + i ] * d_b[i * n + col] ;
      }
      d_c[row * n + col] = sum;
   }
}

// Utility function to print the input matrix
void printMatrix (float m[][WIDTH]) {
   int i, j;
   for (i = 0; i<WIDTH; ++i) {
      for (j = 0; j< WIDTH; ++j) {
         printf ("%d\t", (int)m[i][j]);
      }
      printf ("\n");
   }
}

// Main function executed by the host (CPU)
int main () {
   // host matrices
   float host_a[WIDTH][WIDTH],
         host_b[WIDTH][WIDTH],
         host_c[WIDTH][WIDTH];

   // device arrays
   float *device_a, *device_b, *device_c;

   int i, j;

   // initialize host matrices using random numbers
   time_t t;
   srand ((unsigned) time(&t));

   for (i = 0; i<WIDTH; ++i) {
      for (j = 0; j<WIDTH; j++) {
         host_a[i][j] = (float) (rand() % 50);
         host_b[i][j] = (float) (rand() % 50);
      }
   }

   printf ("Matrix A:\n");
   printMatrix (host_a);
   printf ("\n");

   printf ("Matrix B:\n");
   printMatrix (host_b);
   printf ("\n");

   // allocate device memory for input matrices
   size_t deviceSize = WIDTH * WIDTH * sizeof (float);
   cudaMalloc ((void **) &device_a, deviceSize);
   cudaMalloc ((void **) &device_b, deviceSize);

   // copy host matrices to device
   cudaMemcpy (device_a, host_a, deviceSize, cudaMemcpyHostToDevice );
   cudaMemcpy (device_b, host_b, deviceSize, cudaMemcpyHostToDevice );

   // allocate device memory to store computed result
   cudaMalloc((void **) &device_c, deviceSize) ;

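   // one TILE_WIDTH x TILE_WIDTH block per tile; the grid is rounded up to cover the matrix
   dim3 dimBlock (TILE_WIDTH, TILE_WIDTH);
   dim3 dimGrid ((WIDTH + TILE_WIDTH - 1)/TILE_WIDTH, (WIDTH + TILE_WIDTH - 1)/TILE_WIDTH);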
   product<<<dimGrid, dimBlock>>> (device_a, device_b, device_c, WIDTH);

   // copy result from device back to host
   cudaMemcpy (host_c, device_c, deviceSize, cudaMemcpyDeviceToHost);

   // output the computed result matrix
   printf ("A x B: \n");
   printMatrix (host_c);

   cudaFree (device_a);
   cudaFree (device_b);
   cudaFree (device_c);
   return 0;
}


Answer 1

Sum function code:

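// Kernel function executed by the device (GPU)
// Each thread computes one element of d_d = d_a + d_b
__global__ void
sum (float *d_a, float *d_b, float *d_d, const int n) {
   int col = blockIdx.x * blockDim.x + threadIdx.x ;
   int row = blockIdx.y * blockDim.y + threadIdx.y ;

   // the matrices are flat row-major arrays, so index with row * n + col
   if (row < n && col < n) {
      d_d[row * n + col] = d_a[row * n + col] + d_b[row * n + col];
   }
}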
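
The product kernel derives row and col from blockIdx, blockDim and threadIdx, and a sum kernel can use the same mapping so that each thread handles one element. The following is a minimal host-side sketch in plain C that emulates that mapping on the CPU. The function name count_visits and the TILE_WIDTH x TILE_WIDTH block shape are assumptions made for illustration; they are not part of the original program.

```c
#define TILE_WIDTH 2   /* threads per block edge (assumed launch configuration) */
#define WIDTH 6        /* matrix edge length, as in the question */

/* CPU emulation of a 2D launch (hypothetical helper).
   Fills hits[] with the number of simulated threads that map to each
   element and returns the total number of simulated threads. */
int count_visits(int hits[WIDTH * WIDTH]) {
    int blocks = (WIDTH + TILE_WIDTH - 1) / TILE_WIDTH;  /* blocks per grid edge */
    int threads = 0;

    for (int i = 0; i < WIDTH * WIDTH; ++i) hits[i] = 0;

    for (int by = 0; by < blocks; ++by)                  /* blockIdx.y  */
        for (int bx = 0; bx < blocks; ++bx)              /* blockIdx.x  */
            for (int ty = 0; ty < TILE_WIDTH; ++ty)      /* threadIdx.y */
                for (int tx = 0; tx < TILE_WIDTH; ++tx) {/* threadIdx.x */
                    int row = by * TILE_WIDTH + ty;
                    int col = bx * TILE_WIDTH + tx;
                    ++threads;
                    /* same bounds check as in the kernel */
                    if (row < WIDTH && col < WIDTH) ++hits[row * WIDTH + col];
                }
    return threads;
}
```

With these values the emulation runs 36 threads and every element is visited exactly once, which is why a per-thread kernel needs no loop over i and j.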

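In the main function, add these lines after the product kernel launch. dimGrid and dimBlock are already declared there, so they are reused rather than declared again:

   // host and device storage for the sum
   float host_d[WIDTH][WIDTH];
   float *device_d;
   cudaMalloc ((void **) &device_d, deviceSize);

   sum<<<dimGrid, dimBlock>>> (device_a, device_b, device_d, WIDTH);

   // copy result from device back to host
   cudaMemcpy (host_d, device_d, deviceSize, cudaMemcpyDeviceToHost);

   // output the computed result matrix
   printf ("A + B: \n");
   printMatrix (host_d);

   // release the extra device buffer (place with the other cudaFree calls)
   cudaFree (device_d);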
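
To check the values copied back into host_d, one option is to compare them with a sum computed on the CPU. The sketch below is plain host-side C. The names sum_reference and matrices_match are hypothetical helpers introduced here, and the matrices are assumed to be passed as flat row-major arrays (for example &host_a[0][0]).

```c
/* CPU reference (hypothetical helper): c = a + b for n x n row-major matrices. */
void sum_reference(const float *a, const float *b, float *c, int n) {
    for (int i = 0; i < n * n; ++i) {
        c[i] = a[i] + b[i];
    }
}

/* Returns 1 when all n x n elements of x and y differ by at most tol,
   and 0 as soon as one element differs by more than tol. */
int matrices_match(const float *x, const float *y, int n, float tol) {
    for (int i = 0; i < n * n; ++i) {
        float diff = x[i] - y[i];
        if (diff < 0.0f) diff = -diff;   /* absolute difference */
        if (diff > tol) return 0;
    }
    return 1;
}
```

In main, this could be used by filling a scratch matrix with sum_reference and passing it, together with &host_d[0][0], to matrices_match. Because the inputs here are small integers stored as floats, a tolerance of 0 should be sufficient.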
