Mail Archives: djgpp/2002/04/03/04:15:17

X-Authentication-Warning: delorie.com: mailnull set sender to djgpp-bounces using -f
Lines: 56
X-Admin: news AT aol DOT com
From: sterten AT aol DOT com (Sterten)
Newsgroups: comp.os.msdos.djgpp
Date: 03 Apr 2002 09:11:06 GMT
References: <3CAAB61B DOT FC481AF4 AT is DOT elta DOT co DOT il>
Organization: AOL http://www.aol.com
Subject: Re: help with inline AT&T assembly
Message-ID: <20020403041106.20871.00001341@mb-mm.aol.com>
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

Eli Zaretskii wrote:
 >Sterten wrote:
 >> and then I had several errors, I don't like the AT&T-syntax.
 >Many people disagree with you (they don't like the Intel syntax).

Well, there are some objective measurements. One of them is
source size, and sheer typing time.
Typing all the "%" and "(", which need the Shift key, is awful.
I don't like the Intel syntax either. I don't understand why
we are supposed to write "mov eax,ebx" or, even worse,
"movl %%ebx,%%eax" instead of just A=B.
And who knows what "punpckldq" means? ;-)
Maybe that's the reason why most people don't like assembly?
I'd like to view C as an assembler with "macros", but then I'd have
to be able to predict the exact opcodes being generated.
And all assembly instructions should be part of the C language.
BTW, can I specify which register to use for a C variable, and
can I change this during the program?
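
To make the A=B comparison concrete, here is a minimal sketch of that
single move written as GCC extended inline assembly. It assumes GCC (or a
compatible compiler) targeting x86; the function name copy_b_to_a is made
up for illustration. The constraint letters "a" and "b" ask the compiler
to place the operands in eax and ebx, which is one way to tie a C variable
to a particular register for the duration of one asm statement. On any
other target the sketch falls back to a plain C assignment.

```c
/* Hypothetical helper: copies b into a using the exact instruction
   quoted above. Assumes GCC-style extended asm on x86. */
int copy_b_to_a(int b)
{
    int a;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* "b" puts the input in ebx, "=a" takes the output from eax.
       With operands present, a literal register name needs "%%". */
    __asm__ ("movl %%ebx, %%eax" : "=a" (a) : "b" (b));
#else
    /* Fallback for non-x86 targets: the plain C form, A=B. */
    a = b;
#endif
    return a;
}
```

Because the constraints are given per asm statement, a later statement
can request different registers for the same variables.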

 >> Here is what finally worked, but gave only a speed improvement
 >> of about 30% on my K6/2:
 >As a rule of thumb, you shouldn't expect any speedups more than 30-50% from
 >going to assembly.  

Usually I get more. This program originally took 250 sec; now it takes 49 sec
thanks to algorithm changes and code optimization, but each individual
improvement only gave 10%-30%.
Using 3 register variables already helped a lot. (I usually don't
use register variables; maybe I'll use more in the future.)
I could still unroll the loop to avoid stalls, for another
estimated 20%. This is all for the K6/2; maybe it's
better on newer processors.
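
As an illustration of the kind of unrolling meant here (not the actual
loop from my program, which is not shown in this post), a minimal sketch
in C: summing an array with four independent accumulators, so that
consecutive additions do not each wait on the previous result. The
function names sum_simple and sum_unrolled4 are made up, and whether this
gains anything depends on the processor and the compiler.

```c
#include <stddef.h>

/* Straightforward loop: every addition depends on the previous one. */
long sum_simple(const int *v, size_t n)
{
    long s = 0;
    size_t i;
    for (i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Unrolled by four, with separate accumulators to break the
   dependency chain between consecutive additions. */
long sum_unrolled4(const int *v, size_t n)
{
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += v[i];
        s1 += v[i + 1];
        s2 += v[i + 2];
        s3 += v[i + 3];
    }
    /* Handle the 0-3 leftover elements. */
    for (; i < n; i++)
        s0 += v[i];

    return s0 + s1 + s2 + s3;
}
```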

 >If you need a larger speedup, you should rethink your

Yes, of course. But I want to compare the algorithms by their speed,
so it does make sense to optimize them before comparing them.
And then the performance also depends on the instances:
one algorithm is better on one instance, while another is
better on another.
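
For the comparison itself, a minimal timing sketch using only the
standard C clock() function. The names work_fn, cpu_seconds and busy_loop
are made up for illustration; busy_loop is just a stand-in for running
one algorithm on one instance. The resolution of clock() is coarse on
some systems, so short runs should be repeated many times.

```c
#include <time.h>

/* Hypothetical callback type: runs one algorithm on one instance. */
typedef void (*work_fn)(void *arg);

/* Returns the average CPU time in seconds for one call of fn(arg),
   measured over `repeats` calls (repeats must be > 0). */
double cpu_seconds(work_fn fn, void *arg, int repeats)
{
    clock_t start, end;
    int i;

    start = clock();
    for (i = 0; i < repeats; i++)
        fn(arg);
    end = clock();

    return (double)(end - start) / CLOCKS_PER_SEC / repeats;
}

/* Stand-in workload: counts up to *(long *)arg. The volatile counter
   keeps the compiler from removing the loop. */
void busy_loop(void *arg)
{
    volatile long counter = 0;
    long limit = *(long *)arg;
    while (counter < limit)
        counter++;
}
```

Calling cpu_seconds once per algorithm and per instance gives a small
table of timings, which shows where one algorithm beats the other.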

I recently had a program which was only half as fast with GCC/djgpp
as with other compilers :-(
Usually GCC/djgpp's code is about 30% slower than, e.g.,
code from the Intel compiler.
Does this match what other people have experienced?
