What is the fastest way to transpose a matrix in C

0 votes

I have a reasonably large matrix that I need to transpose. 

Assume, for example, that my matrix is

a b c d e f
g h i j k l
m n o p q r 

I want the result be as follows:

a g m
b h n
c I o
d j p
e k q
f l r

What is the fastest way to do this?

Jul 15, 2022 in C++ by Nicholas
• 7,760 points
770 views

No answer to this question. Be the first to respond.

Your answer

Your name to display (optional):
Privacy: Your email address will only be used for sending these notifications.
0 votes

This is an excellent question. 

There are several reasons why you might wish to transpose the matrix in memory rather than merely swap coordinates, such as matrix multiplication and Gaussian smearing.

First, let me list one of the transposition routines I employ.

void transpose(float *src, float *dst, const int N, const int M) {
    #pragma omp parallel for
    for(int n = 0; n<N*M; n++) {
        int i = n/N;
        int j = n%N;
        dst[n] = src[M*j + i];
    }
}

Now let's see why the transpose is useful. Consider matrix multiplication C = A*B. We could do it this way.

for(int i=0; i<N; i++) {
    for(int j=0; j<K; j++) {
        float tmp = 0;
        for(int l=0; l<M; l++) {
            tmp += A[M*i+l]*B[K*l+j];
        }
        C[K*i + j] = tmp;
    }
}

That way, however, is going to have a lot of cache misses. A much faster solution is to take the transpose of B first

transpose(B);
for(int i=0; i<N; i++) {
    for(int j=0; j<K; j++) {
        float tmp = 0;
        for(int l=0; l<M; l++) {
            tmp += A[M*i+l]*B[K*j+l];
        }
        C[K*i + j] = tmp;
    }
}
transpose(B);

Matrix multiplication is O(n3) and transpose is O(n2), taking the transpose should have no influence on calculation time (for large n). Loop tiling is more effective than obtaining the transpose in matrix multiplication, but it is considerably more difficult.

answered Jul 22, 2022 by Damon
• 4,960 points

edited Mar 5

Related Questions In C++

0 votes
1 answer

What is the best way to use a HashMap in C++?

The ordered and unordered map containers (std::map and std::unordered map) are included in the standard library.  The items in an ordered map are sorted by key, and insert and access are in O (log n).  For ordered maps, the standard library often use red black trees.  However, this is only an implementation detail.  Insert and access are in O in an unordered map (1).  It is simply another term for a hashtable. An illustration using (ordered) std::map: #include <map> #include <iostream> #include <cassert> int main(int argc, char ...READ MORE

answered Jun 10, 2022 in C++ by Damon
• 4,960 points
1,085 views
0 votes
0 answers

What is the C++ function to raise a number to a power?

What's the best way to raise a n ...READ MORE

Jun 1, 2022 in C++ by Nicholas
• 7,760 points
663 views
0 votes
0 answers

What is the easiest way to initialize a std::vector with hardcoded elements?

I can make an array and initialise&nb ...READ MORE

Jun 27, 2022 in C++ by Nicholas
• 7,760 points
670 views
0 votes
1 answer

Is there an easy way to make a min heap in C++?

Use make heap() and its buddies from algorithm>, or priority queue from queue>.  Make heap and friends are used by priority queue. #include <queue> // functional,iostream,ctime,cstdlib using namespace std; int main(int ...READ MORE

answered Jun 27, 2022 in C++ by Damon
• 4,960 points
1,240 views
0 votes
0 answers

What is the difference between cout, cerr, clog of iostream header in c++? When to use which one?

I looked up the differences between cout, ...READ MORE

Jul 27, 2022 in C++ by Nicholas
• 7,760 points
834 views
0 votes
1 answer

Are virtual functions the only way to achieve Runtime Polymorphism in C++?

fprintf is a polymorphism function in the C programming language. It can print to a file, stdout, a printer, a socket, or whatever else the system can represent as a stream if you supply it different handles. FILE* file = fopen("output.txt", "w"); ...READ MORE

answered Jun 21, 2022 in C++ by Damon
• 4,960 points
594 views
0 votes
1 answer

Why would anyone use set instead of unordered_set?

Unordered sets must compensate for their O(1) ...READ MORE

answered Jun 1, 2022 in C++ by Damon
• 4,960 points
3,193 views
0 votes
1 answer

What is the difference between std::__gcd and std::gcd?

I done some research about this. The ...READ MORE

answered Jun 10, 2022 in C++ by Damon
• 4,960 points
1,947 views
0 votes
1 answer

Lower and Upper Bound in case of Decreasing/Non-ascending vector

Both std::lower bound and std::upper bound must have an increasing (non-decreasing) order as their objective. By giving a comparator as the 4th parameter of the functions, you may modify the meaning of "growing." To work with descending vectors, use std::greater. #include<iostream> #include<vector> #include<algorithm> #include<functional> using namespace std; int main() { ...READ MORE

answered Jun 14, 2022 in C++ by Damon
• 4,960 points
1,478 views
0 votes
1 answer

How do i apply lower_bound to a range of unsorted vector elements?

What's the point of sorting your array? ...READ MORE

answered Jun 15, 2022 in C++ by Damon
• 4,960 points
1,988 views
webinar REGISTER FOR FREE WEBINAR X
REGISTER NOW
webinar_success Thank you for registering Join Edureka Meetup community for 100+ Free Webinars each month JOIN MEETUP GROUP