14 Sep Matrix Addition and Matrix Multiplication

Posted at 17:35h in Computer Science by

Please find the requirement in the attached document.

Need this assignment by Wednesday 9/14, 10:00 pm.

No copy and paste; plagiarism results in course termination.

Requirement.doc

Project # 1 – Matrix Addition and Matrix Multiplication

Task 1: Basic Matrix Addition

For this task, you will develop a complete CUDA program for integer matrix addition. You will add two two-dimensional matrices A and B on the device (GPU) in parallel. After the device matrix addition kernel has run, the result is transferred back to the CPU. Your program will also compute the sum of matrices A and B on the CPU and compare the device-computed result with the CPU-computed result. If the two match, the program prints "Test PASSED" to the screen before exiting.
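The comparison step described above could look roughly like the following. This is only a sketch: the helper names `results_match` and `report` are hypothetical, and the "Test FAILED" message is an assumed wording, since the assignment only specifies what to print on success.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 when the two N*N result matrices agree element by element,
   otherwise prints the first mismatch and returns 0.
   (Hypothetical helper name, not part of the assignment.) */
int results_match(const int *cpu, const int *gpu, int N)
{
    assert(N > 0);
    for (int i = 0; i < N * N; i++) {
        if (cpu[i] != gpu[i]) {
            printf("Mismatch at row %d, col %d: CPU %d, GPU %d\n",
                   i / N, i % N, cpu[i], gpu[i]);
            return 0;
        }
    }
    return 1;
}

/* Prints the message the assignment asks for on success.
   "Test FAILED" is an assumed wording for the failure case. */
void report(const int *cpu, const int *gpu, int N)
{
    puts(results_match(cpu, gpu, N) ? "Test PASSED" : "Test FAILED");
}
```

Because the matrices hold integers, an exact element-by-element comparison is sufficient; no tolerance is needed as it would be for floating-point data.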

The pseudo code for matrix addition on the CPU is as follows:

void add_matrix_cpu(int *a, int *b, int *c, int N)
{
    int i, j, index;
    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            index = i * N + j;   /* row-major offset of element (i, j) */
            c[index] = a[index] + b[index];
        }
    }
}

int main(void) {
    …..
    add_matrix_cpu(a, b, c, N);
}

The pseudo code for matrix addition on the GPU device is as follows:

CUDA C program

__global__ void add_matrix_gpu(int *a, int *b, int *c, int N)
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int index = row * N + col;
    /* threads that fall outside the matrix do nothing */
    if (row < N && col < N)
        c[index] = a[index] + b[index];
}

int main(void) {
    dim3 dimBlock(blocksize, blocksize, 1);
    /* round up so the grid covers the whole matrix */
    dim3 dimGrid((unsigned int)ceil((double)N / dimBlock.x), (unsigned int)ceil((double)N / dimBlock.y), 1);
    /* a, b, c must point to device memory here */
    add_matrix_gpu<<<dimGrid, dimBlock>>>(a, b, c, N);
}

Use the following pseudo code for matrix initialization.

int *a, *b, *c;
int i, j;
a = (int *)malloc(sizeof(int) * N * N);   // N is the matrix size
// then malloc for b and c
int init = 1325;
for (i = 0; i < N; i++) {
    for (j = 0; j < N; j++) {
        init = 3125 * init % 65536;
        a[i * N + j] = (init - 32768) / 6553;   // flat row-major indexing
        b[i * N + j] = init % 1000;
    }
}
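As a runnable sketch of the initialization above, the loop can be wrapped in a function that fills two caller-allocated, flat row-major arrays. The function name `init_matrices` is an assumption made for illustration; the recurrence and constants are the ones given in the pseudo code.

```c
#include <assert.h>

/* Fills a and b (each N*N ints, already allocated by the caller)
   using the recurrence from the assignment.
   `init_matrices` is a hypothetical name, not part of the handout. */
void init_matrices(int *a, int *b, int N)
{
    assert(N > 0);
    int init = 1325;                         /* seed from the assignment */
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            init = 3125 * init % 65536;      /* stays below 65536, so no int overflow */
            a[i * N + j] = (init - 32768) / 6553;
            b[i * N + j] = init % 1000;
        }
    }
}
```

With the seed 1325, the first generated value is 11857, so a[0] is -3 (C integer division truncates toward zero) and b[0] is 857. Printing the first few elements of the 8*8 case is an easy way to confirm that the CPU and GPU versions start from identical inputs.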

Use the following matrix sizes and thread block sizes (the number of threads in each block) to test your CUDA program.

Matrix Size	Size of Thread block
8*8	4*4 (For debugging purpose)
128*128	16*16
500*500	16*16
1000*1000	16*16
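For each row of the table, the number of blocks needed along one grid dimension is the matrix size divided by the block size, rounded up. A minimal sketch of that calculation in integer arithmetic follows; the helper name `blocks_per_dim` is hypothetical.

```c
#include <assert.h>

/* Number of thread blocks needed to cover n elements along one
   dimension when each block holds `block` threads (ceiling division).
   `blocks_per_dim` is a hypothetical helper name. */
int blocks_per_dim(int n, int block)
{
    assert(n > 0 && block > 0);
    return (n + block - 1) / block;
}
```

For the sizes above this gives 2 blocks per dimension for 8*8 with 4*4 blocks, 8 for 128*128, 32 for 500*500, and 63 for 1000*1000. In the last two cases the grid covers more threads than there are matrix elements (32 * 16 = 512 > 500), which is why the kernel needs the `row < N && col < N` check.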

Task 2: Matrix Multiplication

For this task, you will develop a complete CUDA program for matrix multiplication. You will multiply two two-dimensional matrices A and B on the device (GPU) in parallel. After the device matrix multiplication kernel has run, the result is transferred back to the CPU. Your program will also compute the product of matrices A and B on the CPU and compare the device-computed result with the CPU-computed result. If the two match, the program prints "Test PASSED" to the screen before exiting.

The pseudo code for matrix multiplication on the CPU is as follows:

void MatrixMulOnHost(int* M, int* N, int* P, int Width)
{
    for (int i = 0; i < Width; ++i) {
        for (int j = 0; j < Width; ++j) {
            int sum = 0;
            /* dot product of row i of M and column j of N */
            for (int k = 0; k < Width; ++k) {
                int a = M[i * Width + k];
                int b = N[k * Width + j];
                sum += a * b;
            }
            P[i * Width + j] = sum;
        }
    }
}

int main(void) {
    …..
    MatrixMulOnHost(a, b, c, N);
}
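Before comparing against the GPU, it can help to confirm the CPU reference itself on a product that is easy to compute by hand. The sketch below repeats the host routine so that it is self-contained and checks it on a 2*2 case; `check_2x2` is a hypothetical helper, not something the assignment requires.

```c
#include <assert.h>

/* CPU reference from the assignment: P = M * N for Width*Width matrices
   stored as flat row-major arrays. */
void MatrixMulOnHost(const int *M, const int *N, int *P, int Width)
{
    assert(Width > 0);
    for (int i = 0; i < Width; ++i) {
        for (int j = 0; j < Width; ++j) {
            int sum = 0;
            for (int k = 0; k < Width; ++k)
                sum += M[i * Width + k] * N[k * Width + j];
            P[i * Width + j] = sum;
        }
    }
}

/* Hypothetical sanity check: multiplies [[1,2],[3,4]] by [[5,6],[7,8]]
   and returns 1 if the result is [[19,22],[43,50]], else 0. */
int check_2x2(void)
{
    const int M[4] = {1, 2, 3, 4};
    const int N[4] = {5, 6, 7, 8};
    const int expected[4] = {19, 22, 43, 50};
    int P[4];
    MatrixMulOnHost(M, N, P, 2);
    for (int i = 0; i < 4; i++)
        if (P[i] != expected[i])
            return 0;
    return 1;
}
```

The expected values come from the row-by-column rule, for example 1*5 + 2*7 = 19 for the top-left element.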

The pseudo code for matrix multiplication on the GPU device is as follows:

CUDA C program

__global__ void MatrixMulKernel(int* M, int* N, int* P, int Width)
{
    int Row = blockIdx.y * blockDim.y + threadIdx.y;
    int Col = blockIdx.x * blockDim.x + threadIdx.x;
    /* each in-range thread computes one element of P */
    if ((Row < Width) && (Col < Width)) {
        int Pvalue = 0;
        for (int k = 0; k < Width; ++k)
            Pvalue += M[Row * Width + k] * N[k * Width + Col];
        P[Row * Width + Col] = Pvalue;
    }
}

int main(void) {
    dim3 dimBlock(blocksize, blocksize, 1);
    /* round up so the grid covers the whole matrix */
    dim3 dimGrid((unsigned int)ceil((double)N / dimBlock.x), (unsigned int)ceil((double)N / dimBlock.y), 1);
    /* a, b, c must point to device memory here; N is the matrix width */
    MatrixMulKernel<<<dimGrid, dimBlock>>>(a, b, c, N);
}

Use the following pseudo code for matrix initialization.

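int *a, *b, *c;
int i, j;
a = (int *)malloc(sizeof(int) * N * N);   // N is the matrix size
// then malloc for b and c
int init = 1325;
for (i = 0; i < N; i++) {
    for (j = 0; j < N; j++) {
        init = 3125 * init % 65536;
        a[i * N + j] = (init - 32768) / 6553;   // flat row-major indexing
        b[i * N + j] = init % 1000;
    }
}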

Use the following matrix sizes and thread block sizes (the number of threads in each block) to test your CUDA program.

Matrix Size	Size of Thread block
8*8	4*4 (For debugging purpose)
128*128	16*16
500*500	16*16
1024*1024	16*16

Requirements:

1. To use the CUDA compiler environment installed on the CS Unix server, fry.cs.wright.edu, you need to connect to the server remotely with a secure shell client such as PuTTY. You can connect on campus from a Wright State computer, or from your own laptop on the WSU wifi network named “WSU-Secure”. Note that you cannot connect over ssh from computers outside Wright State University without a VPN, or from the campus “WSU_EZ_CONNECT” wifi network. If you want to connect to this server from off campus, you need to install the VPN on your computer first. If you want to edit your CUDA source programs under Windows, download Notepad++ and edit them there. After you finish editing, use a secure file transfer client (for example WinSCP, which you can download and install on your personal computer) to transfer your CUDA source programs to fry.cs.wright.edu.

2. You must submit an ELECTRONIC COPY of your source program through Pilot before the due date. If for some reason Pilot is unavailable, submit your source code by email to [email protected]

3. Submit all your source code, a README file, a report, and any other required files. The README file must explain clearly how to compile and run your programs. In the report, state whether your programs have all the functionality required in the project description, and state clearly any functionality that is not implemented. If your program works correctly, include screenshots in your report. Your submitted file names should include your last name, for example Liu_Project1.cpp, Liu_Project1_Report, Liu_Project1_ReadMe, etc. All submitted project files should contain: Course Number / Course Title, your name, group member’s name, professor’s name, date, and the project name. If these required contents are missing from your submitted files, 5 points will be deducted.

4. The grader or the instructor will test your programs in the CUDA environment on the Linux server, fry.cs.wright.edu. Before you submit your program, connect to this server using your campus ID and test it there (I have demonstrated how to compile and execute a CUDA program on this server. If you have questions, let me know).

5. The programming assignment is individual. You must finish the project by yourself. If you allow others to copy your programs or answers, you will get the same punishment as those who copy yours.

How to use CUDA on fry.cs.wright.edu

First, use PuTTY or another secure shell client to connect to fry.cs.wright.edu with your campus ID (for example, w123abc), then run the following command:

srun -p a100 --gres=gpu:1 --pty bash

This command requests access to a GPU node and launches a bash shell on it.

You can then compile a CUDA program, vectadd.cu, by running the following command in the directory where the source file is located.

nvcc vectadd.cu -o vectadd

You can then run the generated executable, vectadd, with the following command in the directory where it is located.

./vectadd
