AMD’s upcoming Athlon 64 low-end variants, codenamed ‘Paris’ and ‘Victoria’, will not be offered as 64-bit processors but as 32-bit upgrades to the current Athlon XP line. So claims Xbit Labs, having glanced at the chip maker’s latest roadmaps.
AMD to ship Athlon 64s as Athlon XPs
About The Author
Eugenia Loli
Ex-programmer, ex-editor in chief at OSNews.com, now a visual artist/filmmaker.
Follow me on Twitter @EugeniaLoli
Isn’t this like the ‘mode bit’ for the Eclipse discussed in ‘Soul of a New Machine’? I think the conclusion there was that a mode bit is a kludge…
Things have changed a little since the DEC days though.
…system…
I would toss a nearly sickening amount of RAM, the best video card and sound you can buy… Then I would run it exclusively with a 64-bit Linux Kernel based distro.
Then, I would pick up one of those nice “Run Windows within Linux” packages and be all set…
This day will come soon…
Seems to indicate it will be 64 bit.
AMD will launch the next AMD64 processor, the AMD Athlon™ 64 processor – the first and only 64-bit processor for Windows®-based desktop and mobile PCs – on September 23, 2003.
The Eclipse is a machine made by Data General, not DEC (DEC is their competitor with the VAX).
For a nice picture of an Eclipse, see here:
http://www.simulogics.com/nostalgia/DG/Eclipse_01.jpg
And for a previous story on OSNews about a simulator:
http://www.osnews.com/story.php?news_id=3015
That’s a great book, but it sounds like you need to read it again a little more closely.
Hey y’all, I might go to Linux full time if I knew which distro will be 100x better than 64-bit Windows. Which Linux distro will be the best for the Athlon 64 I plan to buy? Please note I left a real email address. Thanks, I enjoy OSNews a lot.
They’re just going to call the low-end 64s XPs and make them “lie” and say they’re 32-bit chips…
I think they’re right to switch the names… if they don’t “kill off” the XP, nobody would buy the new 64. Also, many customers would see 64 bits and freak out, thinking they had to buy all new software just to buy it. Funny that even the chipmaker worries about how abusive software makers have become!
As long as they leave a “backdoor” and don’t permanently cripple the poor things, they might just work out. After all, screwdriver shops [the bulk of AMD vendors] will just slap it in as a faster XP and sell it. Tech heads will hack into it to make it 64-bit, making it wildly popular in a “grassroots” way, kinda like the Celeron 300A.
Also, there’s the Microsoft consideration. MS still doesn’t have a 64-bit OS ready… and AMD would like to sell chips now, I would suppose. This would calm the waters of the MS shops by not having a bunch of people showing up to buy Linux hardware! Otherwise that MS patch could be a long time coming… I’m sure Intel wouldn’t mind at all.
The 64-bit distros are pretty much identical to the 32-bit ones. It’s the same apps just recompiled, so I’d find a distro you like now and get used to it, rather than wait around. SuSE and Red Hat both have 64-bit Opteron-optimised distros, though I think they are quite server oriented.
As far as 100 times better than Windows goes… well, you can use it now, and there are probably more apps that are easy to recompile for 64-bit than there will be waiting around for the closed-source software companies to catch up, but in the end it depends on how much you like running Linux.
RE: 64/32
Thanks for the links, I’ll give it another read. I might even understand it this time.
When I read the posting it sounded like they were going to take an Opteron, clip the HyperTransport, and make it pin-for-pin compatible with the Athlon XP. Thus it would be a chip with 32-bit addressing plus some extra 64-bit features.
Sounds like that would be a good plan.
I guess we will have to wait and see…
Donaldson
Paris caused the fall of Troy through the victory of the Greeks.
Does AMD want to tell us that the Athlon 64 will cause the fall of AMD and Intel’s victory?
Or is this only a bad attempt to have a nice codename?
I read somewhere that even though AMD64 will be ready by September, Windows XP 64-bit for AMD64 will not be ready by then. In essence, since AMD is going after the consumer market with the Athlon XP-64, they have to make it work with Windows XP.
It’s fine for people like me who use Linux to get the full 64-bit benefit, but marketing-wise AMD needs to make sure they’re going after the masses. I think that’s the most plausible reason that they’re going to market the Athlon XP-64 as a 32-bit CPU. Once Windows XP-64 comes out, I am willing to bet that they will switch the whole marketing campaign around.
I just noticed that the oscast.com site is back up and seems to be calling itself os-news too. Anyone else think it’s bad taste? I mean, they could have stuck to their old name. Or did they not know we existed?
I feel better about Apple’s solution for 64 bit. At least the game plan is more clear.
Even though I like AMD, I have to agree. I also think Apple, IBM, and AMD will be the first winners in this whole 64-bit market. Intel will have to follow.
I think AMD has really spoiled the excellent AMD64 technology with a confusing, oft-changing marketing and branding policy. In the end, having so many names / pin layouts / model numbering schemes just confuses everyone and dilutes the impact of what should be a really powerful technology.
Alastair, I entirely agree with you. They are terrible at marketing, and deciding on products.
Even in the AMD64 family they have different pin counts for at least one product. This way I can’t even use the same motherboard if, let’s say, I decide to go with the cheaper option at first and then upgrade to the slightly more expensive Athlon 64 or Opteron.
I would be more specific about what I am talking about, but AMD has indeed confused me about their offerings. It is not only their fault but also the internet media’s fault.
At least back in the days of the Duron/Athlon, I knew that if I bought a quality motherboard but chose a cheaper CPU option such as the Duron, I could upgrade to the Athlon later. A clear advantage over Intel.
Hello,
I don’t think this is going to be much of an issue for AMD. While as a hobbyist I hate it, very few end users upgrade their CPUs. In 15 years in the corporate and hobby worlds I have upgraded CPUs twice: once for an Alpha CPU, and at work we did it on a 12-processor Sun box.
AMD seems to have two market segments going: business chips (read: ECC) and non-business. This “32-bit” chip is clearly going to end up as the Duron of the Opteron family (sooner or later it will be). The real questions about this news leak are:
Q) Is AMD chopping off the 64 bit instructions?
I doubt this greatly, mucking with a die is already a good way to lose all of your hair and wish for death.
Q) Could AMD be mucking with the memory controller?
I think this is more likely. I mean, they have it as a separate unit on the die; they could reduce the pin count and force addressing down to 32 bits. This would be my guess.
As a side note, I have yet to see a chip with full 64-bit addressing. The most I have seen is 48-bit; the top bits are set to zero. So clipping the internal address bus would eliminate at least 16 lines of logic and should make the CPU benchmark better for 32-bit apps. I’m guessing the 64-bit instructions will be present but would suffer from the cramped memory bus.
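For what it’s worth, shipping AMD64 parts do implement 48-bit virtual addressing, and the architecture requires the unused upper bits to be a sign-extension of bit 47 (a “canonical” address) rather than simply zero. A minimal sketch of that check in C (the helper name is made up, not anything from AMD’s documentation):

```c
#include <stdint.h>

/* Hypothetical helper: AMD64 forwards only 48 virtual-address bits
 * and requires bits 63:48 to be copies of bit 47 ("canonical" form).
 * Sketch under that 48-bit assumption. */
int is_canonical(uint64_t va) {
    /* Shift the 48 low bits to the top, sign-extend back down,
     * and see whether we recover the original address. */
    int64_t extended = (int64_t)(va << 16) >> 16;
    return (uint64_t)extended == va;
}
```

So a user-space pointer like 0x00007fffffffffff is canonical, while 0x0000800000000000 (bit 47 set, upper bits clear) would fault.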
Till it ships…
Donaldson
>>Seems to indicate it will be 64 bit.
>> AMD will launch the next AMD64 processor, the AMD Athlon™ 64
>>processor – the first and only 64-bit processor for Windows®-based
>>desktop and mobile PCs – on September 23, 2003.
Good to know that they published that IN THE FUTURE. September 23, 2003 is over a month away from today!
Dell has little reason to jump aboard the AMD train right now. You see, AMD is losing market share, they are losing the speed race, and they are not coming out with newer processors. And did you notice that the big OEMs that actually carry AMD processors (currently, HP) don’t advertise Athlons like they advertise Intel (i.e. “NEW”, “FAST” and whatnot)?
Maybe AMD would do great for themselves, but Intel is committed to moving from x86 to another architecture: IA-64. Itanium may currently be for high-end servers and workstations, but like the Pentium Pro, I doubt it will be for much longer. As soon as Intel comes out with the promised near-native-speed emulation, they are ready to go into the high-end consumer market.
What’s all this 32-bit crap? First, it was the articles that the athlon64 was only going to get released in very limited quantities, and now this – a 32bit variant of the 64bit chip. What a joke. It sounds like AMD is kissing microsoft’s ass by delaying the chip until MS can move their elephantine bowels and drop a 64bit turd of an os on us all. Pathetic.
The CPU speeds, lately, seem to be increasing painfully slowly – like a snail on a bed of nails. And why only 64 bits? Why not just go for a 256-bit chip and skip all the useless drivel in between!!?!? Doesn’t anyone INNOVATE any more???? Doesn’t even one company believe in making a better product instead of peddling the same old shit with some minuscule improvement? And I want a boom box that’ll play both MP3s and Ogg files. Where the hell is it??? DO IT NOW!!!! I’ll buy it!!!
And why only 64 bits? Why not just go for a 256-bit chip and skip all the useless drivel in between!!?!? Doesn’t anyone INNOVATE any more
I would say “Why go to 64-bit now?”. Perhaps you don’t understand – going 256-bit with today’s technologies is not only utterly impossible, but also utterly useless. 64-bit poses no barriers we need to cross as of now. Besides, 256-bit would not cause speed to be 4x higher; it would perhaps even be slower.
Jebus, what a fanboy load that is. They even link to David “Mackido/Windows NT has DOS roots” Every, whose inane zealotry eventually got him run out of the ArsTechnica forums.
http://www.os-news.com/themes/SeaBreeze/images/logos/os-newslogo-sl…
So why don’t they call it tec-news, tek-news, tech-news
I would say “Why go to 64-bit now?”. Perhaps you don’t understand – going 256-bit with today’s technologies is not only utterly impossible, but also utterly useless. 64-bit poses no barriers we need to cross as of now. Besides, 256-bit would not cause speed to be 4x higher; it would perhaps even be slower.
Because x86-64 doubles the number of general purpose registers (16, instead of 8 on x86). This allows for better optimizations and thus faster code.
Forget the extra addressable RAM as most people don’t need it yet. One of the biggest problems with x86 is the lack of registers available to the programmer. x86-64 helps in solving this problem.
1. Opteron is a server chip; you won’t find any low-priced enthusiast motherboards for it, so stop looking and stop whining.
2. Athlon 64 is the consumer 64-bit chip to be released late this year, and AMD has made this clear in many press releases. The only thing “confuzzled” here are people’s unfounded expectations.
3. The Register is a rumor mill. It’s interesting and fun, but take what you read there with a grain of salt.
4. Athlon 64 will be able to run in legacy mode so that it can run the 32-bit version of Windows XP that’s currently selling in stores, along with all your current 32-bit software and drivers (except certain chipset drivers, obviously).
5. When Microsoft releases a version of Windows XP for AMD 64, then you can buy the OS and run the chip in 64-bit mode. 64/32 is most likely just a mode switch that can be done for compatibility with 32-bit programs. Of course, you could just run an AMD64-based Linux distro instead of waiting.
Because x86-64 doubles the number of general purpose registers (16, instead of 8 on x86). This allows for better optimizations and thus faster code.
That has nothing to do with rajan r’s point. These extra registers could easily have been added to a 32-bit chip as a set of extensions; his point was that the move to 64-bit addressing does not automatically make everything twice as fast. As far as the Athlon 64 being a better chip than the currently available Athlon XPs, you’re absolutely correct.
Because x86-64 doubles the number of general purpose registers (16, instead of 8 on x86). This allows for better optimizations and thus faster code.
Not necessarily. Unless the program is large, like Photoshop or AutoCAD, which aren’t everyday applications, the extra registers wouldn’t increase speed dramatically, if speed increases at all. But development time would increase, for obvious reasons.
Besides, you don’t have to completely move to 64-bit to offer 64-bit features. In today’s 32-bit x86 processors, there are a lot of things that are not 32-bit. The same with the 286: a lot of features are actually 32-bit. The processor itself may not be 64-bit, but heck, we don’t need 64-bit now. And those who actually do need it are already using 64-bit.
Wee Jin Goh: Forget the extra addressable RAM as most people don’t need it yet. One of the biggest problems with x86 is the lack of registers available to the programmer. x86-64 helps in solving this problem.
Did I even mention RAM? 64-bit goes a whole lot further than addressable RAM, buddy.
null_pointer_us: That has nothing to do with rajan r’s point. These extra registers could easily have been added to a 32-bit chip as a set of extensions; his point was that the move to 64-bit addressing does not automatically make everything twice as fast. As far as the Athlon 64 being a better chip than the currently available Athlon XPs, you’re absolutely correct.
Aww, null_pointer_us, you spoiled my chance on debunking this nice bloke here…
Not necessarily. Unless the program is large, like Photoshop or AutoCAD, which aren’t everyday applications, the extra registers wouldn’t increase speed dramatically, if speed increases at all. But development time would increase, for obvious reasons.
There will be a speed increase. It will be arguable whether the increase will be noticeable in simple business apps, but for everything else (games, multimedia apps, SETI, etc.) it will be noticeable. You see many reports on the web that when people recompile for Opteron, they get an approximately 30% speed increase. That’s mainly due to the increased number of registers and the on-die memory controller.
The reason development time would increase isn’t obvious to me. Please explain.
In today’s 32-bit x86 processors, there are a lot of things that are not 32-bit. The same with the 286: a lot of features are actually 32-bit. The processor itself may not be 64-bit, but heck, we don’t need 64-bit now. And those who actually do need it are already using 64-bit.
I’m sorry, but I do not understand what you are saying. What are you referring to specifically? An int is 32 bits, and most programs make use of them. What ‘lot of features’ on a 286 are 32-bit? But I do agree that most of today’s programs don’t make use of 64-bit integers, so the move to 64 bits wouldn’t benefit them.
That has nothing to do with rajan r’s point. These extra registers could easily have been added to a 32-bit chip as a set of extensions; his point was that the move to 64-bit addressing does not automatically make everything twice as fast. As far as the Athlon 64 being a better chip than the currently available Athlon XPs, you’re absolutely correct.
They could have been added as an extension to a 32 bit processor, but they aren’t. AMD has added them to their line of 64 bit processors. Most people I know aren’t too interested in the Opteron for its 64 bit instruction set/address space. Rather, we are more keen on the performance increase we will get due to the increase in GPRs, and on-die memory controller.
From the Alpha world…
64-bit may or may not mean speed. Sure, your integers are now 8 bytes (64 bits), but that means every time you fetch one from main memory you’re pulling twice the data. No longer can you just say (void*) = (int) if all of your pointers are 64 bits and your integers may be 32 bits.
The real speed comes in a few areas:
1. Floating point. Since all floating point these days is 48 bits or greater (usually topping out at 128 bits), the ability to fetch in 64-bit chunks really helps this out. (3D game engines??)
2. Internal data movement: DMA with 64 bits.
3. When playing with strings you can now rip in 8 chars instead of 4. (This usually helps.)
4. Filesystems can get 64-bit file sizes. (Insert remark about porn here.)
5. Just to make life fun, though, 64-bit usually means an increase in memory size and memory footprint, since the processor will want everything aligned on 8-byte boundaries. (In the Alpha world we use unalign to find these issues.) In fact this may be a problem on Opteron and G5 for years to come.
About chip features: I just remembered my 68B09E chip (8-bit) had a 16-bit multiply. The second source for the chip, the 6309, added a 32-bit multiply.
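Donaldson’s pointer-size and alignment points can be seen directly with sizeof, assuming an LP64 platform (the model used by 64-bit Linux on Alpha and Opteron, where int stays 32-bit and pointers grow to 64). The struct below is a made-up illustration, not anything from a real codebase:

```c
#include <stddef.h>
#include <stdint.h>

/* On LP64, int stays 4 bytes while pointers grow to 8, so the old
 * (void*) = (int) trick truncates.  Mixing the two in a struct also
 * shows the footprint cost: the compiler pads the struct to keep the
 * 8-byte member on an 8-byte boundary. */
struct mixed {
    int32_t tag;      /* 4 bytes                   */
                      /* + 4 bytes of padding here */
    void   *payload;  /* 8 bytes, 8-byte aligned   */
};
/* sizeof(struct mixed) comes out to 16 on LP64, not the 12 a
 * 32-bit programmer might expect. */
```

This padding is exactly the “memory footprint” growth he mentions: the data didn’t get bigger, but the alignment holes did.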
Just some fuel for thought.
Donaldson
They could have been added as an extension to a 32 bit processor, but they aren’t. AMD has added them to their line of 64 bit processors. Most people I know aren’t too interested in the Opteron for its 64 bit instruction set/address space. Rather, we are more keen on the performance increase we will get due to the increase in GPRs, and on-die memory controller.
I don’t understand what you are arguing about. I have already said that I agree with what you’ve said about the Athlon 64 being a better chip. My purpose in responding to you earlier was merely to point out that you had taken rajan r’s post out of context. Most likely it was unintentional, but accidental or not it made rajan r appear to say something that he doesn’t actually agree with.
It just struck me, are you null_pointer from the GameDev.net message boards?
Yes. Do I know you?
Not arguing about anything, just making my points clearer
When I used to frequent GameDev some years ago, I used the handle of NuffSaid.
Donaldson: Are you Alan Cox’s brother? Or are you Alan Cox? Oh… now I remember… you know Alpha… and the problems that Alan encountered with maintaining the Alpha port.
Anyway… AMD totally and completely misses the point with a 64/32 hybrid. Sell the hell out of Opteron, then when the server market is saturated, make a Socket 462 version (i.e. 32-bit external) and crash the desktop market.
Does anyone remember the 8088? Like 8-bit external, 16-bit internal? Or how about the P54C OverDrive? Make an old 486 run Pentium code. So with hardware you can easily turn 64 bits into 32 by multiplexing them. So what if it’s slower… you’re running on a legacy board; it’s got to be faster than an Athlon XP 2100! (Yeah, that’s a 130nm process also, so there is NOT going to be a speedup on scale, but perhaps a newer core will speed things up… and with a 64-bit core you get faster internal processing; computationally intensive tasks will enjoy the speedup.)
Intel did this, several times, and did it well, except for starting off on the wrong foot with the Vacancy signs…
As for switching modes or the prevention thereof: they can either do the broken connectors on top (pencil in your 64-bit upgrades) or break a few interconnects off (under the epoxy layer, or burn a hole in the chip). They have been stupid about it in the past, and will probably not learn from their mistakes…
(Donaldson*) = (god)
Would you mind telling me how you came up with all of these predictions out of one report from one site supposedly getting leaked information about products that haven’t yet been released? I’d also like to know what relation your ramblings have to the news story, and why you attach “Alan Cox’s Brother” to them. Is that simply to make your points appear more credible?
Anyway… AMD totally and completely misses the point with a 64/32 hybrid. Sell the hell out of Opteron, then when the server market is saturated, make a Socket 462 version (i.e. 32-bit external) and crash the desktop market.
Is that AMD’s strategy, or is that the strategy you think that they ought to adopt? According to The Register, both of the AMD64 core-based chips will be using the same 754-pin socket. Both chips will be able to operate in 64-bit mode and run 64-bit code, except that one will have a 32-bit bus and/or reduced cache (the story wasn’t clear on this part of the speculation). This doesn’t differ significantly from the Duron/Athlon strategy that AMD adopted quite successfully in recent times; in fact, it’s very, very similar. I don’t see you providing enough of a basis to predict AMD failing here.
As for switching modes or the prevention thereof: they can either do the broken connectors on top (pencil in your 64-bit upgrades) or break a few interconnects off (under the epoxy layer, or burn a hole in the chip). They have been stupid about it in the past, and will probably not learn from their mistakes…
Nowhere does the article state or even imply that either of the two AMD64 consumer chips will be unable to execute x86-64 code. The article merely speculates that one of the processors will be shackled to a 32-bit bus, hurting its performance somewhat. And no, I’m not stupid enough to come to a conclusion about its performance until credible hardware review sites get their hands on production versions of these chips, if and when they are actually produced.
(Donaldson*) = (god)
He also hasn’t said anything to support or deny your ideas.
Not arguing about anything, just making my points clearer
It is like Greek to me. I can’t make out your sentences, no offence. Besides, I’m not saying that Athlon 64 is a bad chip. It is a good chip. It would probably save AMD (and, unfortunately, x86).
But my original post was:
“The CPU speeds, lately, seem to be increasing painfully slowly – like a snail on a bed of nails. And why only 64 bits? Why not just go for a 256-bit chip and skip all the useless drivel in between!!?!? Doesn’t anyone INNOVATE any more???? Doesn’t even one company believe in making a better product instead of peddling the same old shit with some minuscule improvement? And I want a boom box that’ll play both MP3s and Ogg files. Where the hell is it??? DO IT NOW!!!! I’ll buy it!!!”
Sorry, nope; might be cool though. I’ve been playing around with Linux for about 11 years now. I’ve done a little kernel work over the years and helped support Loki with its Alpha port of games (and some Rawhide testing). Just been through the mill once too often.
I’m hoping AMD doesn’t f all of this up. My points were about why people haven’t jumped up to 128-bit buses: the demand just isn’t there. It’s like using RISC in an embedded CPU. 2 KB of memory, and a pointer costs what… (My favorite embedded CPU is the PIC series: fun to program, and the CMOS/TTL logic saves parts.)
I’m not sure what AMD is up to either. I was just expressing my hopes for a Socket A Opteron, but I was wrong. It appears right now there are at least 3 different chips in the 2 different markets. AMD is messing up.
And on the issue of a 256-bit chip (again): my personal desire is to see an analysis of the efficiency of the chips per frequency. I mean, the PIII did more than the P4 per clock cycle. I think an efficiency-per-clock-cycle study would be nice.
Also a note on the issue of address bus size. Very few servers take 4 TiB of memory. Until something like a Sun E15K uses 4 TiB, I don’t think there is any demand to go to a 256-bit address bus. And how big is the socket for that monster? 256 address lines, 256 data lines: best case 530 to 550 pins; ugh, routing the circuits on that bad boy would not be fun. I would rather write a Veritas port for Linux. I guess you could clip the address lines down to 48 (set the high bits to zero) and that could save 400+ pins. Owwww.
Oh, on a side note: once these “modern programmers” learn how to use an AVL tree I’ll worry about speeds. Way too many clock cycles are spent on doubly-linked-list searches for data.
Sorry about the random thought firing. Long day.
Donaldson
I’m not sure what AMD is up to either. I was just expressing my hopes for a Socket A Opteron, but I was wrong. It appears right now there are at least 3 different chips in the 2 different markets. AMD is messing up.
I’m afraid that I don’t understand the problem. With the Duron and Athlon, AMD had 2 chips in 1 market. Now that AMD has branched out into the server market, they have 3 chips in 2 different markets. Why would that indicate that they are messing up? It’s the same architecture, x86-64, and the strategy is almost identical to the Duron/Athlon strategy with the exception of branching out into the server market.
And on the issue of a 256-bit chip (again): my personal desire is to see an analysis of the efficiency of the chips per frequency. I mean, the PIII did more than the P4 per clock cycle. I think an efficiency-per-clock-cycle study would be nice.
Well, that’s one part of the equation. There’s the megahertz myth that hung over the G4, and then there’s the work per cycle myth that hung over the P4. Why don’t we just evaluate chips in the context of what they were intended to do and then determine which gets us the best price/performance/heat/whatever deal for job X in situation Y. Both the G4 and P4 were pretty successful for what they were intended to do. Maybe what I’d like to see is a comparison of the number of those chips sold vs. the number of people who predicted that they would fail utterly.
Oh, on a side note: once these “modern programmers” learn how to use an AVL tree I’ll worry about speeds. Way too many clock cycles are spent on doubly-linked-list searches for data.
I agree there’s a certain segment of the programming community that wrongly assumes performance to be irrelevant, and there is certainly a lot of room for improvement in the size and execution speed of most programs, but on the other hand it just isn’t feasible to optimize every line of code in a several million line program.
Good programmers don’t build optimum programs right away; rather, good programmers design programs in such a way that they are easy to optimize later. The first goal is correctness; the program must do what it says it does and be fairly solid (e.g., hard to introduce new errors). Only after the program is built can we evaluate the effectiveness of our optimizations on the code. If I’ve properly abstracted the data structure used in a widely-used algorithm, then changing it from a doubly-linked list to an AVL tree should be fairly trivial.
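A sketch of that abstraction point, with made-up names: if callers talk to the container only through a small ops table, swapping the doubly-linked list for an AVL tree later means supplying a new ops struct, not rewriting the algorithm. (The array backend below is just a stand-in; a real AVL backend would fill the same three slots.)

```c
#include <stdlib.h>

/* The interface callers are written against. */
typedef struct table_ops {
    void *(*create)(void);
    void  (*insert)(void *table, int key, void *value);
    void *(*find)(void *table, int key);
} table_ops;

/* Algorithm code touches only the interface, so the data structure
 * behind it can be optimized later without changing this function. */
void *build_and_query(const table_ops *ops, int key) {
    void *t = ops->create();
    ops->insert(t, key, "hit");
    return ops->find(t, key);
}

/* Toy backend: a fixed-size unsorted array standing in for the
 * linked list.  An AVL-tree backend would plug in the same way. */
enum { CAP = 64 };
typedef struct { int keys[CAP]; void *vals[CAP]; int n; } arr_table;

void *arr_create(void) { return calloc(1, sizeof(arr_table)); }
void arr_insert(void *t, int key, void *value) {
    arr_table *a = t;
    if (a->n < CAP) { a->keys[a->n] = key; a->vals[a->n] = value; a->n++; }
}
void *arr_find(void *t, int key) {
    arr_table *a = t;
    for (int i = 0; i < a->n; i++)
        if (a->keys[i] == key) return a->vals[i];
    return NULL;
}

const table_ops array_ops = { arr_create, arr_insert, arr_find };
```

The profiler then tells you which ops table is worth replacing with something O(log n).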
What do you think of HP’s Dynamo? It’s a dynamic optimizer. I bet that type of software could be very effective at eliminating the “abstraction penalties” so prevalent in modern code. I wonder what an Itanium chip paired with something like Dynamo integrated into the OS would be capable of.
Hello,
On dynamo,
I love Dynamo. The main abstraction penalty it eliminates is the library penalty. It also does a good job of re-inlining small segments of code. (There is another one of these beasts out there; I think it was DEC’s. The big difference was that DEC’s went from MIPS to Alpha, while Dynamo went from an HP CPU to an HP CPU.)
On programming,
I agree code has to run correctly, but I have seen SO much software that was never designed to run fast. People use the STL and implement lists because they are easy. They never think of the data set growing beyond size 10. A recent example is the limit on the number of songs in iTunes and Winamp. If programmers would just pick real data structures and use them during the design phase, a lot of programs would run faster. They might also work on the thread logic too, and Windows could find threads for the GUI.
Actually, replacing a list with a tree can be different. The tree may prevent duplicate keys, and while a tree may easily support O(1) iteration like a list, the deletes are O(log n) (I think; maybe it’s n log n). Also, a threaded tree (as in multiple process threads) needs a little forework. I can think of many real-world examples of this problem. You don’t know how much this makes me mad. Fin AVL trees are old, so why do people put sorted data in lists instead of binary trees? And if speed is important, take the fin STL and throw it out the damn window.
NOTE: Anybody that wants a template for an AVL tree, drop me an email and I will mail it to you. It’s also on my web site: http://www.aztechnologyinc.com.
On clock efficiency
I’m not willing to restart the MHz war that rages here every week. What I’m looking at (not being a chip designer) is why the P4 is slower than the PIII when clocked at the same frequency. I remember the K6 had a really good branch predictor that the K7 didn’t get. It seems to me there must be a perfect design for an IA-32 chip. What I mean is, if I take a full permutation of instructions, there must be a perfect time for these to execute. I would like to see this set of instructions run on the various chips and compared to the “perfect CPU”. I think it would be interesting to see the result.
G4 clock myth
I agree: buy the chip for your task, but the G4 is long in the tooth. Whenever I get into the flame war it’s due to companies making stupid statements. I saw one yesterday from a CPU company, but I lost the reference. (On a side note, the print ads for the G5 show the motherboard, but the solder pads are unfilled.) I’m waiting for a bump on the Xserve, and a G5 Xserve running Linux may be my next website upgrade. Or I may buy VMS for my Alpha. (Same price…)
Donaldson
Hello,
I thought I would throw out another problem with 64-bit chips, especially chips running RISC at their core (ergo all chips currently shipping): branch prediction.
Alpha, aka DEC, was the only company I know of that told GCC how many NOPs to insert into the code after a branch to prevent a prefetch stall in the CPU. I wonder what the branch and unaligned-code penalty will be on the G5 and Opteron (IA-32/64).
Hmmm, I wonder: when the Xserve gets a bump, will it get ECC memory and SCSI? Hmmm.
Donaldson
especially chips running RISC at their core (ergo all chips currently shipping)
x86-64 is RISC? Opteron is selling now…
RE rajan:
Yes, the Opteron may use CISC instructions, but I think, much like the Intel chips, it uses a small RISC core to do the real work. This is what I understand, even though I doubt AMD will say one way or the other. The point of the comment was about branch prediction causing a pipeline stall. In a 64-bit or greater chip the penalty is worse because there is more data to refill into the cache. (Faster cache-to-memory transfers get you less forgiveness when flushing.) After the GCC team patched the Alpha target to insert NOPs after a branch, the code ran faster. (NOPs being easier to flush.) Well, with the Opteron, will a NOP after the branch “help the core”? And of course there are memory alignment issues too.
Donaldson
Not arguing about anything, just making my points clearer
It is like Greek to me. I can’t make out your sentences, no offence. Besides, I’m not saying that Athlon 64 is a bad chip. It is a good chip. It would probably save AMD (and, unfortunately, x86).
Sorry, my bad. I’ll try speaking in English this time.
Nowhere did you say the Athlon 64 was going to be a bad chip. However, you did say that when developing for x86-64, development time will increase for ‘obvious’ reasons. I do not see the ‘obvious’ reasons and would like you to explain further.
Elsewhere you claim certain things about processors which are dubious at best.
In todays 32-bit x86 processors, there is a lot of things not 32-bit. The same with the 286, a lot of features are actually 32-bit. The processor itself may not be 64-bit, but heck, we don’t need 64-bit now. And those who actually do already are using 64-bit.
My question was: what ‘lot of features’ of the 286 were actually 32-bit? I made a point to show that in today’s 32-bit processors, *most* things are 32 bits, giving the example of the most common primitive data type (the integer, or int), which is 32 bits. Your claim that a lot of things are not 32-bit is very vague. Could you substantiate your claims?
No where did you say the Athlon 64 was going to be a bad chip. However you did say that when developing for x86-64 development time will increase for ‘obvious’ reasons. I do not see the ‘obvious’ reasons and would like you to explain further.
I said developing for 64-bit would increase development time. This isn’t the case for x86-64, because it also has legacy support for 32-bit applications.
I made the point that in today’s 32-bit processors, *most* things are 32 bits, giving the example of the most common primitive data type (int), which is 32 bits. Your claim that a lot of things are not 32-bit is very vague.
I never disagreed that most of the things within x86 are 32-bit; that’s why it is called 32-bit. However, there are some limitations of 32-bit that the software world can’t live with, and guess what? They have been killed off with extensions, etc.
Now, just to phrase myself properly so you won’t misunderstand me anymore:
I AGREE with the development of x86-64.
I AGREE that 64-bit wouldn’t be such a waste of time and money today as we are approaching the limits of 32-bit.
I DISAGREE with the concept that 64-bit = 2x the speed of 32-bit.
I DISAGREE with the need of anything above 64-bit, including 256-bit processors.
Is there anything else I haven’t made myself clear about?
I said developing for 64-bit would increase development time. This isn’t the case for x86-64, because it also has legacy support for 32-bit applications.
My question was how will developing for 64 bits increase development time? I see no reason why development time will suffer. You allude to ‘obvious’ reasons, but no reason is obvious to me. This applies to x86-64, Alpha, IA64, UltraSPARC III, etc. With x86-64, most of the time, a simple recompile will do the trick and you end up with a working 64 bit app. So what reasons could there be for the increase in development time?
I never disagreed that most of the things within x86 are 32-bit; that’s why it is called 32-bit. However, there are some limitations of 32-bit that the software world can’t live with, and guess what? They have been killed off with extensions, etc.
That wasn’t what you said the last time. You said “todays 32-bit x86 processors, there is a lot of things not 32-bit.” That’s a pretty vague statement, and all I was asking was for you to clarify what these a-lot-of-things are. Same goes for the 286, where you claim a-lot-of-things are 32 bit.
I am not trying to nitpick or start a fight with you. I am actually interested in learning more about CPU architecture, having recently finished a book on it, and your vague statements, including the one in your last post about how the limits of 32 bits have been killed off by extensions, are just vague. What limits are you talking about? What extensions are you referring to? Are you referring to the problem of address space? Is the extension you are referring to PAE?
hehe
Okay, so I am still unsure of where you stand…
I concur with your 4 points. I would like to add, though:
I DISAGREE with AMD’s current market segmentation tactic.
Here is the current chart, as far as I can make sense of it.
The 754 Asus board does support ECC, so I would say all 754 chips will.

Pins  Mem channels  Name                       Launch  ECC  # CPUs
940   2             Athlon 64 FX / Opteron FX  2Q03    Y    2+
939   2             Athlon 64 FX               1Q04    Y    1
754   1             Athlon 64                  4Q03    Y/N  1+
462   varies        Athlon / Athlon MP         4Q00    Y/N  1-2
I personally thought the 754 would drop all ECC support, but no, it keeps it. So the segmentation is: 939/940 for business, 754 for the home user. The 939 will not work in 940 sockets, therefore no cheap multi-CPU machines. Three chips with three sockets; I think that’s bad. I personally wonder if the 939 will not just replace the 940 and gain multi-CPU ability.
Anyone can clear this up?
Donaldson
Hello,
Actually, a recompile rarely ever does the trick. In the Linux world it may seem that way, but the applications are still not 64-bit clean after 8 years (and Linus has an Alpha).
When porting to 64-bit you must go through your entire code base and root out the following:
1. Pointers treated like integers.
2. Any type of bit manipulation.
3. Bit-ordering tricks. (I usually use ntoh to fix these.)
4. Writing data structures. In a 64-bit world the structures will be aligned on 8-byte boundaries, so a simple recompile will alter the byte placement of structures. A LOT of applications will dump a structure to the disk to save; to retain portability you need to do a dynamic translation in and out. You have to repack for 32-bit compatibility.
5. Pointers are all new sizes, so all sub-libraries need to be rebuilt.
6. No Purify. Enough said there.
7. Threads are 64-bit.
8. Integers are 64-bit, so file access to 32-bit filesystems has to be checked.
9. All packets over the network need to be cleaned.
10. Memory alignment issues. (Someone should port unalign to the Opteron.)
This is assuming you have a working, bug-free 32-bit app. Crafting a 64-bit app from scratch is easier, as long as you reteach the programmers not to muck with the pointers:
((unsigned int)(char *)a)++
And heaven forbid it’s old MS code with near and far pointers. Ugh.
11. If anyone hardcoded a malloc size, it needs to be fixed. Hardcoded assumptions like
sizeof(int) == 8
sizeof(void*) == 4
are exactly what breaks.
12. The compiler may do the following to you…..
sizeof(short int) == 2
sizeof(int) == 4
sizeof(long int) == 4
sizeof(long long int) == 8
sizeof(void*) == 8
If anyone has an Opteron, can you check this?
And for the G5 I might add.
13. If the code was written well and has already been ported to 64-bit once, it should recompile and only take a week or so to verify and test. If it’s never been ported, it can take 2 months.
Note: I’ve done this porting before, for Loki, and sometimes you never can track down all of the problems.
Donaldson
Found this review of G4, G5 AMD chips on http://www.os-News.com (hyphen)… very informative: http://www.os-news.com/modules.php?op=modload&name=News&file=articl…
I read that earlier, interesting article. I think the single-proc G5 with the 900 MHz connection to HyperTransport is going to be the sweet spot. Now that the machines are shipping, the benchmarks should flow…
Donaldson
I agree there could be problems with porting, but most of the problems that you’ve listed are really due to the programmer’s fault. Here’s my take on them, and they are by no means perfect. Do correct me as you see fit.
1. Pointers treated like integers.
This is probably a remnant of C and its excessive use of void pointers. Using a more modern language like C++ should reduce this.
2. any type of bit manipulation
3. bit ordering tricks. (I usually use ntoh to fix these)
Currently, the most common bit ordering problems that I run into are mainly solvable by ntoh and hton (just like you said), so no real biggie.
Bit masking will have some problems, especially with regards to integer size. Shouldn’t be too hard to fix, just replace the ints with shorts. It could get tedious after a while….
4. Writing data structures.
You must agree that just dumping data structures onto disk is bad programming practice. You lose almost all portability, as the representation of the data structure on disk will mirror the data structure’s representation in memory on that particular machine. You’ve already outlined the solution I would have come up with. As an aside, I think this makes a great case for text-based formats.
5. Pointers are all new sizes so all sub libraries
need to be rebuilt.
7. Threads are 64 bit.
8. Integers are 64 bit so the file access to 32 bit
filesystems have to be checked.
Agreed, and you’ve given solutions. Not sure why threads being 64 bit are a problem …
6. No purify enough said there.
Which purify are we talking about? Rational Purify? Sorry if I’m sounding stupid…
9. All packets over the network need to be cleaned.
What do you mean cleaned?
10. Memory alignment issues. (Someone should port unalign
to the opteron.)
Again, due to changes in data type sizes. This isn’t such a big problem if you don’t dump data structures directly onto disk and don’t try to access members of structures using pointer offsets (i.e. *(some_pointer + 6)). Programmers who do that on the pretext of ‘optimizing’ (hey, I’ve actually heard that before) ought to be shot.
11. If anyone hardcoded a malloc needs to be fixed.
This is one way of doing malloc
int *a = malloc(sizeof(int) * 10);
This is a better way of doing malloc
int *a = malloc(sizeof(*a) * 10);
Do not hardcode the type in sizeof. If you change the data type of a, you’ll have to change the parameter of sizeof too. The 2nd way removes the need for such redundant changes and reduces the chance of errors.
sizeof(short int) == 2
sizeof(int) == 4
sizeof(long int) == 4
sizeof(long long int) == 8
sizeof(void*) == 8
Don’t know about what the C language specifies, but this is how it is with the Win32 API. http://msdn.microsoft.com/library/default.asp?url=/library/en-us/wi…
Personally, I think most of these problems do not apply to porting a Java (not sure about .NET) app to 64 bits. Which I think is a good reason to advocate their use.
My bad. In my previous post, I very wrongly assumed that the integer size would change when moving from 32 bits to 64 bits. I should have done proper research, before posting fallacies.
http://cedar.intel.com/software/idap/media/pdf/esp/MigraTEC_Rev1.pd…
It appears that integers and longs will still be 4 bytes (i.e. 32 bits), while pointers will become 64 bits.
Also, can someone point me to an online copy of the C99 standard? I am under the impression that the C standard doesn’t guarantee the size of its data types, and it only guarantees that they will obey the following rule:
sizeof(char)<=sizeof(short)<=sizeof(int)<=sizeof(long)
To anyone who has a copy of the C99 standard, is that correct?
That’s about what the standard says. (I don’t have it handy.) The closest wording was that an int is the base word length for the chip. (Yuck.) The only hardcoded number I remember from the standard is NULL = 0. (What idiot did that I will never know.)
Don’t forget, on porting code:
1. Most code is legacy, so you inherit all of the problems.
2. When porting you don’t want to start changing types unless you have to. It modifies all of the call signatures and stack offsets, and means that every procedure/method ends up with #ifdef structures.
3. On saving structures to disk: most games do this, and the Windows serialize methods pull the same trick (3dsmax does something similar), so this is a compatibility and portability issue.
4. If you want a quick idea of how bad the code is, take a file and under gcc use the options -Wall -pedantic -ansi and compile away. (I usually find fixing all of these errors eliminates 90% of the problems.)
Enumerated items
1. Treating pointers like integers is mainly in speed sections. (I happen to use void* a lot in C++ for callbacks and for playing multiple-inheritance self-destroying class games.) But in these speed sections there may also be hand-coded assembler (usually matrix and vector math). I guess what I’m saying is it’s still very common in C++; too powerful an optimization tool to pass up. I mean, it’s a quick way to do a lot of things.
6. Yes, Rational Purify. That is (insert your primary deity here)’s gift to coders.
7. The coder might be storing a thread handle in shared memory, so the shared memory will need to be increased in size. If there are two applications and only one is ported to 64-bit, it will want a larger shared memory segment with a larger thread handle. Not to mention, since the free store usually comes from the top of memory, it will have its high bits set, so clipping it to 32 bits is not going to work. The flip case is that the stack grows down from the top. <joy>
9. It depends on how the code is dumping data across a network connection. You can no longer issue a write with sizeof(struct). (A common way to do it, I might add.)
10. Alignment. This is more than pointer math; it can affect the size of a structure and whether a structure can fit on one page of memory. Usually an unaligned memory access costs extra CPU cycles.
11. Hardcoded malloc. Either way works okay, just as long as it’s not malloc(10).
C++ standard, pg 31, Section 3.9.1, part 4:
There are four signed integer types: “signed char”, “short int”, “int”, and “long int.” In this list, each type provides at least as much storage as those preceding it in the list, but the implementation can otherwise make any of them equal in storage size. Plain ints have the natural size suggested by the machine architecture; the other signed integer types are provided to meet special needs.
ummm, I think I ran out of things to type.
Donaldson
Amazing. This is getting good, and I am learning stuff!
I’ve never seen the need to treat pointers as integers, and most certainly not for speed. I was under the impression that compilers were good enough these days to not need so much pointer hacking anymore. Yes, I have looked at the assembler listing of programs that I’ve compiled (with MSVC, don’t understand AT&T), and they look about the same.
With alignment, couldn’t this be left up to the compiler? I know most compilers have a flag to let you align your data structures (in MSVC, this defaults to 8 bytes).
After looking at this thread on problems with porting, I am getting more convinced that these problems do not occur in a VM environment like Java/.NET. But writing in Java or a .NET enabled language brings the problem of waiting for someone to port the runtime. Ah well, can’t have everything.
On speed pointers.
If you want to see speed, go write Fortran code for a while. Fortran has a lot of bad points, but it grinds numbers the fastest of anything out there. There are some old techniques for writing self-modifying code using pointer math. (Been 12 years since I did that, though.) If you really want to see the quest for speed, look up Duff’s device, or the use of gotos. I have some code lying around that implements exception handling in C using setjmp and longjmp.
The compiler may support alignment flags, but if the supporting libraries aren’t compiled with the flag, the calling stacks become unaligned and cause segdumps. (Took me a month to realize that one.)
Also, MS code will break if you activate the ansi flag. Oh, their mktemp call is broken too. I wish those people would buy a copy of Purify and run their code tree through it. Sigh.
I notice you’re in the UK. I have been trying to find the name of a programming language I ran into 11 years ago from “over the pond”. Its critical feature was that all statements in the language ran in parallel; in order to have serial code you had to mark the sections. Ever heard of it?
There are some examples of pointer math I have seen, but the most common I can think of is expressing a 3×3 matrix as double mat[9]; then you can use mat+n to pull out elements. This drops a multiplication in most cases.
On VM languages:
You’re right about VM languages, as long as you avoid any machine-specific or other “specific areas” not defined in the spec, sort of like relying on the finalize method being called in Java. This issue of VM languages versus compiled languages is on its third pass. The rule is: if you need absolute speed and/or absolute control, use a compiler. .NET has its own octopus of issues, but there is a port of .NET you could work with (the name has left my brain).
When it comes to optimizations: I was optimizing my vector template a few months ago, and it’s amazing how many CPU cycles you can save.
CPE_Vector<data> operator * ( const CPE_Matrix<data> & )const;
void Mul_V_M ( const CPE_Vector<data> & ,
const CPE_Matrix<data> & );
inline void Mul_V3_M3 ( const CPE_Vector<data> & ,
const CPE_Matrix<data> & );
inline void Mul_V4_M4 ( const CPE_Vector<data> & ,
const CPE_Matrix<data> & );
inline void Mul_V4_M3 ( const CPE_Vector<data> & ,
const CPE_Matrix<data> & );
inline void Mul_V3_M4 ( const CPE_Vector<data> & ,
const CPE_Matrix<data> & );
The first one is the “correct academic solution”; the rest of the methods are the ones used in the real world. Academic solutions are really good in school, not industry. In fact the only mistake NeXTstep made (besides Jobs) was Objective-C; an industry language should not do runtime binding. If you’re playing under Linux you can time your code using jiffies. (Make sure it loops a lot of times; I recommend the code run at least 30 seconds.) I got them from the proc filesystem, but they broke recently. Shrug. So I moved to using times() to benchmark code.
And making a function call is more expensive in 64-bit because you have to pop a 64-bit PC off the stack instead of a 32-bit one.
O well
Donaldson
what i want is a program that levies the total cpu usage vs total cpu wastage so that all you fools buying expensive shit see how much money you’re wasting on the fastest shit.
But 30 frames per minute is not equal to 30 frames per second. I may have “fast” equipment, but then again when I’m working I’m running both CPUs around 85 percent. The system you’re proposing is the old billing system from batch systems. It works only if you don’t have a deadline for a job to get done. But then again, if you only see in 2 colors… (reference: old Sega commercial for the handheld game system)
donaldson
If I got into marketing, I could highlight a hundred and one things wrong with AMD’s marketing campaign and still leave out a few more. For example, AMD Me could become a well-known household name, except for one problem: the amount of advertising for it is much lower than for Intel Inside, and unlike Intel, they apparently are reducing their budget. They should know the importance of a good recognizable brand.
My question was how will developing for 64 bits increase development time? I see no reason why development time will suffer. You allude to ‘obvious’ reasons, but no reason is obvious to me. This applies to x86-64, Alpha, IA64, UltraSPARC III, etc. With x86-64, most of the time, a simple recompile will do the trick and you end up with a working 64 bit app. So what reasons could there be for the increase in development time?
Read Donaldson’s post. I have no time to post for myself, and Donaldson is nice and simple for you to understand. And, no, a recompile wouldn’t end you up with a working 64-bit app; you would end up with an app more optimized for x86-64. And no, taking the long and tedious journey of porting to 64-bit wouldn’t make your app faster than it would be with 32-bit on x86-64. Porting only the stuff that would see a speed increase would be smarter.
That wasn’t what you said the last time. You said “todays 32-bit x86 processors, there is a lot of things not 32-bit.” That’s a pretty vague statement, and all I was asking was for you to clarify what these a-lot-of-things are. Same goes for the 286, where you claim a-lot-of-things are 32 bit.
I refuse to debate with someone who doesn’t even know this simple logic. Okay, go Google the limitations of 32-bit. Then Google the limitations of x86. You would find less on the x86 list. Plus, it pretty much differs from processor to processor.
And no, I didn’t mean the addressing space, which is one of the big reasons for x86-64.
I refuse to debate with someone who doesn’t even know this simple logic. Okay, go Google the limitations of 32-bit. Then Google the limitations of x86. You would find less on the x86 list. Plus, it pretty much differs from processor to processor.
Yeah, whatever. Donaldson’s posts (and hopefully my replies) contain substance, something which your post severely lacks. Using your logic, I’ve googled for 32 bit stuff on the 286, hoping to find loads of references to them. Zip.
So yeah, whatever. Post your vague statements and then ask others to google for what you are trying to say.
Is sacrificing code readability for the extra speed gained by using jumps and gotos worth it? I’ve done a little Fortran and didn’t like it at all. I think I’ll stick to code I can read and take a small speed hit.
I don’t know what parallel language that is. 11 years ago, I was 10 and was only programming in BASIC. Perhaps you are referring to Goedel, which I’ve only come across in lectures.
I’ve never heard of misaligned libraries causing segdumps. I’ve had code compiled for different structure alignments, and it all worked fine. Are you sure it was due to alignment issues?
The thing about VM languages is that the spec tends to define practically everything that you need, unless you’re doing really low-level stuff like device driver writing. After 3 years of Java, I’ve only had to go beyond the Java spec and use JNI when working with legacy C programs. It wasn’t a pleasant experience, which probably explains why I’d rather let legacy code die and start afresh. I don’t know .NET well enough to be aware of its serious issues, but I’m guessing that they are similar to those faced by Java.
In a 64 bit program, given that the underlying hardware’s registers and datapaths will be 64 bits, wouldn’t popping a 64 bit program counter be the same as popping a 32 bit program counter on a 32 bit machine?
With regards to your vector multiplication methods, the “academic” solution is more general, and can handle multiplication when the size of your vector and matrix isn’t 3 or 4. But if 3 or 4 is the most common size of your vectors and matrices, by all means optimize for them. I’ve always got extra marks for writing efficient code.
RE: Fortran and speed
The problem is it’s not a little gain. In its domain (Fortran’s being math work), generic, badly written Fortran is faster than C, and a lot faster than C++.
Re Optimizer:
I think this was the option that caused me the problem:
-malign-double
I compiled a C++ program and used streams. I opened a file using streams. When the streams exited (fell out of scope and needed to be deleted), the code would seg fault.
I like VM languages like Java too, but my current code needs to be fast and Java is way too slow. Shrug, maybe my next project. And on the other side, it’s a LOT easier to get gcc for a chip than a VM. My Alpha has a really old VM because of lack of interest, but I get cxx, which is a better compiler than gcc.
RE PC:
It’s still double the data coming off the stack, and it more than likely is a one-cycle instruction, but I was trying to express that everything “may” feel the bit increase, so something that worked before may break. Oh well, keeps me employed.
RE academic:
The problem was it was only a little increase in generality. Moving away from the generic case was a jump in magnitude of CPU usage; the copy constructor and return values hurt.
I’ll try to post some supporting numbers later; coffee calls.
Donaldson
Here are those numbers I was talking about. This is on a 32-bit IA32 platform. The target numbers are Fortran-style. Basically, the higher the jiffies diff, the closer to the academic solution. I’m looking forward to the Opteron cutting these numbers by 40 percent, since this is all double math.
Donaldson
Cross_Product Target (3)
Jiffies diff :1
Memory diff :0
Cross_Product * P(3)
Jiffies diff :84
Memory diff :0
Cross_Product * L(3)
Jiffies diff :84
Memory diff :0
Cross_Product Cross P(3)
Jiffies diff :17
Memory diff :0
Cross_Product Cross L(3)
Jiffies diff :16
Memory diff :0
Cross_Product Cross_3 P(3)
Jiffies diff :5
Memory diff :0
Cross_Product Cross_3 L(3)
Jiffies diff :1
Memory diff :0
What are the jiffies diff? What kind of measurement is that? Also, what score would the academic version get? A jiffies diff of 100?
And so, if one gets a jiffies diff of 1, does that mean it is 100x faster than the academic version?
Currently, jiffies diff is:
(clock ticks at end) – (clock ticks at start)
So it is the number of clock ticks to run this test. I use automated regression testing and automated benchmarking on my library code.
Originally it was the number of clock jiffies from the /proc/<pid>/stat file; however, 2.4 of the kernel broke it for some reason. So the current version uses times() (reference: man 2 times).
This result is just a vector cross product (size 3 vectors)
Academic: 84 clock ticks
Fortran code: 1 clock tick
So the Fortran-style code is 84 times faster than the academic code.
///////////////////////////
//
// TARGET Cross_Product (3)
//
///////////////////////////
Get_Mem_Size(mem_in);
Get_Num_Cycles(cyc_in);
for(count_0 = 2000000; count_0 > 0 ; --count_0)
{
    (c.i)=(a.j*b.k)-(a.k*b.j);
    (c.j)=(a.k*b.i)-(a.i*b.k);
    (c.k)=(a.i*b.j)-(a.j*b.i);
}
Get_Num_Cycles(cyc_out);
Get_Mem_Size(mem_out);
foo_out << " Cross_Product Target (3)" << std::endl;
foo_out << " Jiffies diff :" << cyc_out-cyc_in << std::endl;
foo_out << " Memory diff :" << mem_out-mem_in << std::endl;
//delete a;
//delete b;
//delete c;
vec_d_0=new CPE_Vector<double>(-1.0, 2.0, 1.0);
vec_d_1=new CPE_Vector<double>( 3.0, 1.0,-1.0);
vec_d_2=new CPE_Vector<double>(-3.0, 2.0,-7.0);
vec_d_3=new CPE_Vector<double>(3);
vec_d_0_3[0]=1.0; vec_d_1_3[0]=1.0; vec_d_2_3[0]=1.0;
vec_d_0_3[1]=1.0; vec_d_1_3[1]=1.0; vec_d_2_3[1]=1.0;
vec_d_0_3[2]=1.0; vec_d_1_3[2]=1.0; vec_d_2_3[2]=1.0;
// Operator *
Get_Mem_Size(mem_in);
Get_Num_Cycles(cyc_in);
for(count_0 = 2000000; count_0 > 0 ; --count_0)
{
    (*vec_d_3)=(*vec_d_0)*(*vec_d_1);
}
Get_Num_Cycles(cyc_out);
Get_Mem_Size(mem_out);
foo_out << " Cross_Product * P(3)" << std::endl;
foo_out << " Jiffies diff :" << (cyc_out-cyc_in) << std::endl;
foo_out << " Memory diff :" << mem_out-mem_in << std::endl;
// Operator *
Get_Mem_Size(mem_in);
Get_Num_Cycles(cyc_in);
for(count_0 = 2000000; count_0 > 0 ; --count_0)
{
    vec_d_2_3 = vec_d_0_3 * vec_d_1_3;
}
Get_Num_Cycles(cyc_out);
Get_Mem_Size(mem_out);
foo_out << " Cross_Product * L(3)" << std::endl;
foo_out << " Jiffies diff :" << (cyc_out-cyc_in) << std::endl;
foo_out << " Memory diff :" << mem_out-mem_in << std::endl;
// Cross Pointer
Get_Mem_Size(mem_in);
Get_Num_Cycles(cyc_in);
for(count_0 = 2000000; count_0 > 0 ; --count_0)
{
    vec_d_3->Cross((*vec_d_0),(*vec_d_1));
}
Get_Num_Cycles(cyc_out);
Get_Mem_Size(mem_out);
foo_out << " Cross_Product Cross P(3)" << std::endl;
foo_out << " Jiffies diff :" << (cyc_out-cyc_in) << std::endl;
foo_out << " Memory diff :" << mem_out-mem_in << std::endl;
// Cross Local
vec_d_0_3[0] = 1.0; vec_d_0_3[1] = 2.0; vec_d_0_3[2] = 3.0;
vec_d_1_3[0] = 1.0; vec_d_1_3[1] = 2.0; vec_d_1_3[2] = 3.0;
Get_Mem_Size(mem_in);
Get_Num_Cycles(cyc_in);
for(count_0 = 2000000; count_0 > 0 ; --count_0)
{
    vec_d_2_3.Cross(vec_d_0_3,vec_d_1_3);
}
Get_Num_Cycles(cyc_out);
Get_Mem_Size(mem_out);
foo_out << " Cross_Product Cross L(3)" << std::endl;
foo_out << " Jiffies diff :" << (cyc_out-cyc_in) << std::endl;
foo_out << " Memory diff :" << mem_out-mem_in << std::endl;
// Cross_3
vec_d_0_3[0] = 1.0; vec_d_0_3[1] = 2.0; vec_d_0_3[2] = 3.0;
vec_d_1_3[0] = 1.0; vec_d_1_3[1] = 2.0; vec_d_1_3[2] = 3.0;
Get_Mem_Size(mem_in);
Get_Num_Cycles(cyc_in);
for(count_0 = 2000000; count_0 > 0 ; --count_0)
{
    vec_d_2->Cross_3((*vec_d_0),(*vec_d_1));
}
Get_Num_Cycles(cyc_out);
Get_Mem_Size(mem_out);
foo_out << " Cross_Product Cross_3 P(3)" << std::endl;
foo_out << " Jiffies diff :" << (cyc_out-cyc_in) << std::endl;
foo_out << " Memory diff :" << mem_out-mem_in << std::endl;
// Cross_3
vec_d_0_3[0] = 1.0; vec_d_0_3[1] = 2.0; vec_d_0_3[2] = 3.0;
vec_d_1_3[0] = 1.0; vec_d_1_3[1] = 2.0; vec_d_1_3[2] = 3.0;
Get_Mem_Size(mem_in);
Get_Num_Cycles(cyc_in);
for(count_0 = 2000000; count_0 > 0 ; --count_0)
{
    vec_d_2_3.Cross_3(vec_d_0_3,vec_d_1_3);
}
Get_Num_Cycles(cyc_out);
Get_Mem_Size(mem_out);
foo_out << " Cross_Product Cross_3 L(3)" << std::endl;
foo_out << " Jiffies diff :" << (cyc_out-cyc_in) << std::endl;
foo_out << " Memory diff :" << mem_out-mem_in << std::endl;
delete vec_d_0;
delete vec_d_1;
delete vec_d_2;
delete vec_d_3;
The nomenclature at the end:
P = pointer deref in calling
L = no pointer deref (& in arg list)
3 = size of vector
Here is a small example.
hmmm…
Let me get out the soldering iron. I feel an excellent mod coming up soon.
I thought I read somewhere that that is incorrect.
Just the pin count between them will be the same; the Athlon XP will have an increased pin count similar to the Athlon 64’s.
There is a reason they’re called Athlon 64s, and that is because they are 64-bit; it would be idiotic to do otherwise. http://www.amdzone.com has pointed out that this is wrong. I don’t know what will happen to this news article though.
I’ve heard that IBM has jumped on board the AMD64 gravy train by announcing that, once the AMD64 is released, it will ship IntelliStations with AMD64 CPUs at their core. It would be great if more companies jumped on board; however, the other big companies would rather be willing Intel whores than provide a good product.
As for the current release, the Opteron: it would have been nice had there been a reasonably priced motherboard from a reputable manufacturer for the desktop market around a month ago. Had one existed, I would have considered buying and assembling an Opteron machine.
The only motherboards I saw around a month ago were server boards priced at around AUS$1800, which is way too much.
The Register is saying the AMD64 can be made to run in 64-bit, 64/32 mixed, or legacy mode. I assume this is something done during fabrication? They don’t make this point clear. They say that these legacy 32-bit-mode versions of AMD64 could be rebranded as Athlon XPs, because that would allow AMD to focus on one architecture instead of two. But then they point out that instead, AMD will apparently be marketing the AMD64 separately from the Athlon XP for a while, and they say:
“The downside with this tactic is that it ultimately reduces the number of shipping 64-bit systems in the field – by allowing buyers who might choose an Athlon 64 to stick with XP – and thus make it harder for software developers to justify porting their 32-bit apps to AMD64.”
But unless the AMD64 can be configured in the BIOS or with a jumper setting to switch from legacy, to mixed, to pure 64-bit mode… isn’t this argument wrong? If they re-brand AMD64 in legacy mode as AthlonXP, it’s the same as a 32-bit chip, right?
Look at pricewatch right now – I was pricing an opteron system this morning. You can get an opteron 242+Mobo Combo for $700 now. Not too shabby I think….
http://www.pricewatch.com/1/306/5512-1.htm
Derek
Rumor mill?
If you read elsewhere on the web, you will see that what is happening is AMD is going to release an Athlon XP with 940 pins (the same count as an Athlon 64 FX with dual-channel DDR). The word is that the chip will perform slightly better using the extra pins, but it is still a 32-bit chip and is in no way an Athlon 64 (no embedded memory controller, and no 64-bit support); it is just an Athlon XP in a different package…
My thoughts are that they just want to move to a standard packaging – probably cheaper for them and cheaper for mobo makers who only have to deal with one kind of socket.
Just my 2 cents.
Derek
They are moving it to 754 pins, NOT the 940 I stated in my post. Sorry.
754 pins is the pin count of the lowest “consumer” Athlon 64.
Derek
I’m very confuzzled. As if 3 new sockets weren’t bad enough, now we’re also getting non-64-bit variants… What on earth is the use of all this? If this is true, then I hope that someone comes up with a little converter that can put an Athlon XP in one of the new socketed motherboards. I can’t be arsed forking up that much money in one go.
But unless the AMD64 can be configured in the BIOS or with a jumper setting to switch from legacy, to mixed, to pure 64-bit mode… isn’t this argument wrong? If they re-brand AMD64 in legacy mode as AthlonXP, it’s the same as a 32-bit chip, right?
As far as I understand it, Athlon 64s will start in real mode and act like an 8086, just like Pentiums do. Switching to 32-bit or 64-bit mode will be done by software (such as bootloaders, etc.).
I think AMD has really spoiled the excellent AMD64 technology with a confusing, oft-changing marketing and branding policy. In the end, having so many names / pin layouts / model numbering schemes just confuses everyone and dilutes the impact of what should be a really powerful technology.
Personally, I’d love an Athlon64 machine, but I will obviously wait until they sort out their marketing, and indeed the chips themselves. Anyway, let’s hope they can ramp up the clock speeds and make the 64-bit chips affordable enough that no-one will want to buy the 32-bit ones any more.
They deserve to “win” with the AMD64 technology, but I hope their marketing people haven’t fumbled the whole thing up for them!
This is false.
There will be Athlon XPs with 754 pins, but they are not “cut-down Athlon 64s”.
Operating mode is not decided at the factory or by a jumper on the motherboard. It is available to the system programmer. 64-bit native mode is enabled by a procedure similar to how 32-bit native mode has been enabled on every CPU since the 386. You always boot up in 16-bit real mode. Then you issue assembler instructions in your bootloader or kernel initialisation routine to switch to the desired mode.
The procedure is explained and illustrated in detail in the AMD64 Architecture Programmer’s Manual, Volume 2: System Programming, on pages 12 through 20.
This sounds pretty strange. I hope it isn’t true.
Maybe they don’t want the Athlon64 to compete with Opteron. They currently have pretty good margins on the Opteron parts. The 64-bit capability won’t really be a mass market selling point until MS comes out with a 64-bit version of their “Home” operating system. Maybe they don’t want to give blade manufacturers a cheap part that can run 64-bit code. They would probably prefer that these manufacturers have to use the more expensive Opteron.
Hopefully, this won’t apply to the dual channel (socket 939) Athlon64s. It could be that they are angling to make socket 754 their low end processor of the future. By removing 64-bit capabilities from the low end processor, they provide more reason to buy the higher end Athlon64 or Opteron (aside from the dual channel capabilities).
RE: Thomas
I may be misreading your post, but I think you may be misunderstanding things. The Opteron has 3 software modes (legacy, mixed, full 64-bit). These modes are switched by software running on the chip. The rumor here is that AMD may be removing the ability to run in mixed or full 64-bit modes from the initial Athlon 64 parts. This could be done either through modifications to the core (unlikely IMO) or through the packaging of the core (whereby the 64-bit core is “configured” to run 32-bit only). I doubt there will be an easy way to “hack” it back to 64-bit capable, but who knows.
Look at pricewatch right now – I was pricing an opteron system this morning. You can get an opteron 242+Mobo Combo for $700 now. Not too shabby I think….