Memory alignment with FPC

Memory alignment with FPC

Darius Blaszyk

Does FPC ensure the same memory alignment for records and objects across different platforms? If I want to be sure of identical alignment, must I use packed records (possibly with an explicit alignment assigned) in combination with data types that are guaranteed to have the same size on all platforms (e.g. byte, word, single)? Or do unpacked records behave the same across platforms?

 

When memory is aligned (either with {$PACKRECORDS N} or unpacked), are the padding bytes guaranteed to be #0, or are they undefined?

 

Any help appreciated.

 

Regards, Darius

 

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal

Re: Memory alignment with FPC

Jonas Maebe-2

On 10 Oct 2012, at 12:02, [hidden email] wrote:

> Does FPC ensure the same memory alignment for records and objects
> over different platforms?

No, alignment is defined by the platform ABI.

> If I want to be sure to have the same
> alignment must I use packed (with possibly some aligning assigned)
> records instead in combination with data types that are guaranteed to be
> of the same size over the different platforms (eg byte, word, single) ?

Yes.

> When
> memory is aligned (either with {$PACKRECORDS N} or unpacked), are the
> padding bytes guaranteed to be #0 or are they undefined?

They are undefined.
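
As a rough, language-neutral illustration of the first two answers (this is Python, not FPC code, and the field list is made up for the example): the standard `struct` module can lay out the same two fields either with the platform's native alignment or back to back with fixed sizes. The native size depends on the platform ABI, while the packed size is the same everywhere.

```python
import struct

# Field list: one signed byte followed by one 8-byte double.
native_size = struct.calcsize("@bd")  # native mode: ABI alignment, padding inserted
packed_size = struct.calcsize("<bd")  # standard mode: fixed sizes, no padding

print("native layout:", native_size, "bytes")  # platform-dependent
print("packed layout:", packed_size, "bytes")  # 1 + 8 = 9 on every platform
```

The difference between the two sizes is padding, which is exactly the part of the layout that a file format should not depend on.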


Jonas


Re: Memory alignment with FPC

Sven Barth-2
In reply to this post by Darius Blaszyk
On 10.10.2012 12:02, [hidden email] wrote:
> Does FPC ensure the same memory alignment for records and objects over
> different platforms? If I want to be sure to have the same alignment
> must I use packed (with possibly some aligning assigned) records instead
> in combination with data types that are guaranteed to be of the same
> size over the different platforms (eg byte, word, single) ? Or do
> unpacked records behave the same over different platforms?

Please note that you might also need to take care of different endianness
(big endian vs. little endian) if you want to transfer such record data
(e.g. through files) between systems with different endianness. In that
case you'll need to write the record field by field and convert the
endianness on one of the two sides...
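
A minimal sketch of the field-by-field approach, written in Python rather than FPC and using a made-up two-field record (`ident`, `value`): each field is serialized with an explicit byte order, so the bytes on disk are identical no matter which machine wrote them. The choice of little-endian here is only an assumption for the example.

```python
import struct

def write_record(ident: int, value: float) -> bytes:
    """Serialize field by field: uint16 id, then float32 value, both little-endian."""
    return struct.pack("<H", ident) + struct.pack("<f", value)

def read_record(data: bytes):
    """Read the fields back at their fixed file offsets (0 and 2)."""
    ident, = struct.unpack_from("<H", data, 0)
    value, = struct.unpack_from("<f", data, 2)
    return ident, value

blob = write_record(0x1234, 1.5)
print(blob.hex())         # 34120000c03f
print(read_record(blob))  # (4660, 1.5)
```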

Regards,
Sven

Re: Memory alignment with FPC

Darius Blaszyk

On 10 Oct '12, Sven Barth wrote:

> On 10.10.2012 12:02, [hidden email] wrote:
>> Does FPC ensure the same memory alignment for records and objects over different platforms? If I want to be sure to have the same alignment must I use packed (with possibly some aligning assigned) records instead in combination with data types that are guaranteed to be of the same size over the different platforms (eg byte, word, single) ? Or do unpacked records behave the same over different platforms?
>
> Please note that you might also need to take care of different endianness
> (big endian vs. little endian) if you want to transfer such record data
> (e.g. through files) between systems with different endianness. In that
> case you'll need to write the record field by field and convert the
> endianness in one of the two cases...
>
> Regards,
> Sven

Thanks Jonas and Sven... Endianness is indeed an issue I will be handling. It's my understanding now that by storing data in a packed record I will get the same behaviour (when it comes to aligning data in memory) on all platforms and architectures. The same applies to packed objects. However, when objects have methods, does that change the memory alignment (across different architectures)? Is there any impact when I start using object inheritance?

 

The background behind my questions is the implementation of an IFF reader/writer in FPC. The reader/writer will be generated from available data types (simple ones as well as objects and records). To simplify the code I want to make use of the abstraction that FPC gives me.

 

Regards, Darius

 

 


Re: Memory alignment with FPC

Darius Blaszyk
In reply to this post by Jonas Maebe-2

On 10 Oct '12, Jonas Maebe wrote:

> On 10 Oct 2012, at 12:02, [hidden email] wrote:
>
>> Does FPC ensure the same memory alignment for records and objects
>> over different platforms?
>
> No, alignment is defined by the platform ABI.
>
>> If I want to be sure to have the same
>> alignment must I use packed (with possibly some aligning assigned)
>> records instead in combination with data types that are guaranteed to be
>> of the same size over the different platforms (eg byte, word, single) ?
>
> Yes.
>
>> When
>> memory is aligned (either with {$PACKRECORDS N} or unpacked), are the
>> padding bytes guaranteed to be #0 or are they undefined?
>
> They are undefined.
>
> Jonas

 

One more question, when using packed records, is there anything to say about performance? Are there some tests anywhere that show how the performance is impacted?

 

Darius

 


Re: Memory alignment with FPC

Jonas Maebe-2
In reply to this post by Darius Blaszyk

On 10 Oct 2012, at 14:31, [hidden email] wrote:

> Thanks Jonas and Sven... Endianness is indeed an issue I will be handling. It's
> my understanding now that by storing data in a packed record I will have
> the same behaviour (when it comes to aligning data in memory) over all
> platforms and architectures. Same applies to packed objects. However
> when objects have methods, does that change memory alignment (compared
> to different architectures)? Is there any impact when I start using
> object inheritance?

You should never write objects containing VMTs to disk as a whole, that will only lead to trouble. Inheritance does not introduce VMTs in objects, but virtual methods, constructors and destructors do.

> One more
> question, when using packed records, is there anything to say about
> performance? Are there some tests anywhere that show how the performance
> is impacted?

It will depend on the target platform, the order of the fields and types of the fields. It is impossible to say in general what the impact will be. It can be very large or non-existent.


Jonas


Re: Memory alignment with FPC

Nico Erfurth-2
In reply to this post by Darius Blaszyk
On 10.10.12 14:40, [hidden email] wrote:

> One more question, when using packed records, is there anything to say
> about performance? Are there some tests anywhere that show how the
> performance is impacted?

This highly depends on the architecture/processor.

Many architectures (like older ARMs) cannot handle unaligned access at
all. Such an access raises an exception, which has to be handled
by the system (Linux provides the capability to emulate these faults,
but at a VERY high performance cost). That's why FPC emits byte loads for
halfword and word loads on packed records. But for a word these will
also take about 7-15 cycles, depending on the compiler version and core
used. Newer ARM implementations can handle unaligned access, but only
with some performance penalty.
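
To make the "byte loads instead of a word load" idea concrete, here is a conceptual sketch in Python (not the instruction sequence FPC actually emits; the helper name is made up): a 32-bit value stored at an odd offset is assembled from four single-byte reads plus shifts, which never requires an aligned word access.

```python
import struct

def load_u32_le_bytewise(buf: bytes, offset: int) -> int:
    """Assemble a 32-bit little-endian value from four single-byte loads."""
    return (buf[offset]
            | (buf[offset + 1] << 8)
            | (buf[offset + 2] << 16)
            | (buf[offset + 3] << 24))

# One leading byte puts the 32-bit field at the odd offset 1 (no padding with "<").
buf = struct.pack("<BI", 0xAA, 0xDEADBEEF)
value = load_u32_le_bytewise(buf, 1)
print(hex(value))  # 0xdeadbeef
```

Four loads, three shifts and three ORs replace a single load, which is where the extra cycles come from.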

x86 can handle unaligned access, but most implementations (I think
current Atoms and the VIA Nano are an exception) will suffer a rather high
performance penalty.

If you only use packed records to get your data from and to disk/network
you'll not see much of a difference, but you should avoid using a packed
record inside something which is performance sensitive.

Nico

Re: Memory alignment with FPC

Mark Morgan Lloyd-5
In reply to this post by Sven Barth-2
Sven Barth wrote:

> On 10.10.2012 12:02, [hidden email] wrote:
>> Does FPC ensure the same memory alignment for records and objects over
>> different platforms? If I want to be sure to have the same alignment
>> must I use packed (with possibly some aligning assigned) records instead
>> in combination with data types that are guaranteed to be of the same
>> size over the different platforms (eg byte, word, single) ? Or do
>> unpacked records behave the same over different platforms?
>
> Please note that you might also need to take care of different endianness
> (big endian vs. little endian) if you want to transfer such record data
> (e.g. through files) between systems with different endianness. In that
> case you'll need to write the record field by field and convert the
> endianness in one of the two cases...

I've had some success defining a custom := operator for this.

--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]

Re: Memory alignment with FPC

Bernd K.
In reply to this post by Darius Blaszyk
2012/10/10  <[hidden email]>:
> However when
> objects have methods, does that change memory alignment

No. As long as there are no virtual methods it will not affect the
size or the memory layout. From what I have read the new gtk3 API for
fpc will also use objects (with methods and also inheritance) instead
of records to represent gtk and glib structures.

Re: Memory alignment with FPC

Darius Blaszyk
In reply to this post by Darius Blaszyk

On 10 Oct '12, [hidden email] wrote:

> One more question, when using packed records, is there anything to say about performance? Are there some tests anywhere that show how the performance is impacted?

I did some performance tests on win32, and packed and unpacked objects and records all show exactly the same performance. Writing the individual variables of a record or object to file takes about 5.5 times longer than writing them at once. If someone wants to run my test app on other platforms, please let me know and I can post the code. I will do more testing later on Mac and linux32. I'm interested in how win64 and linux64 behave in this respect, so if someone has these architectures please let me know.

 

This makes me wonder whether choosing a proper value for $PACKRECORDS could make my file safely readable on all platforms, so that I only need to convert the endianness where applicable. I would then not be forced to pad my structs manually. Say I use a value of 16: would that cover all ABIs FPC currently supports?

 

Jonas: do you have an overview of the alignment on all architectures that FPC supports? Perhaps you could pinpoint where in the compiler this is handled? If appreciated I could make a patch to include this info in the documentation in the future.

 

Regards, Darius



Re: Memory alignment with FPC

Jonas Maebe-2

On 11 Oct 2012, at 13:59, [hidden email] wrote:

> I did some performance tests on win32 and it appears that
> both packed and unpacked objects and records all show exactly the same
> performance. Writing the individual variables in a record or object to
> file takes about 5.5 times longer than writing them at once. If someone
> wants my test app to run it on other platforms please let me know then I
> can post the code. I will do more testing later on mac and linux32. I'm
> interested how win64 and linux64 behave in this respect. So if someone
> has these architectures please let me know.

As mentioned before, it not only depends on the platform, but also on the contents of the object/record. E.g., a badly misaligned double will generally give much worse performance even on Intel.

> This makes me wonder if
> choosing a proper value for $PACKRECORDS could make my file readable
> safely on all platforms, only needing to convert the endianness if
> applicable. This would not force me to do manual padding in my structs.
> Say I use a value of 16 would that cover all ABIs FPC currently
> supports?

Yes.

> Jonas: do you have an overview of the alignment on all
> architectures that FPC supports?

The information is not just architecture-specific, but also OS-specific (e.g. the alignment of int64 is 4 on Darwin/i386, but 8 on all other i386 platforms). This is defined in the platform ABI documents (application binary interface).
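
A small sketch of why this matters for file layouts, in Python rather than FPC. The layout rule used here (round each field up to its alignment, then pad the total to the largest alignment) is the usual C-style rule and is an assumption of the example; the only figures taken from this thread are the two int64 alignments. The table names and the record are made up.

```python
def layout(fields, alignment):
    """Return (offsets, total_size) for fields laid out in declaration order.

    fields: list of (name, type_name, size) tuples.
    alignment: dict mapping type_name to its required alignment.
    """
    offsets = {}
    pos = 0
    max_align = 1
    for name, type_name, size in fields:
        align = alignment[type_name]
        pos = (pos + align - 1) // align * align  # round up to the field's alignment
        offsets[name] = pos
        pos += size
        max_align = max(max_align, align)
    total = (pos + max_align - 1) // max_align * max_align  # tail padding
    return offsets, total

fields = [("flag", "byte", 1), ("count", "int64", 8)]

# Hypothetical alignment tables; only the int64 values come from the thread.
darwin_i386 = {"byte": 1, "int64": 4}
other_i386 = {"byte": 1, "int64": 8}

print(layout(fields, darwin_i386))  # ({'flag': 0, 'count': 4}, 12)
print(layout(fields, other_i386))   # ({'flag': 0, 'count': 8}, 16)
```

The same declaration yields a 12-byte record under one table and a 16-byte record under the other, so a record dumped as a whole on one platform would be misread on the other.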

> Perhaps you could pinpoint where in the
> compiler this is handled? If appreciated I could make a patch to include
> this info in the documentation in the future.

It's a combination of tdef.alignment (and its overridden methods in compiler/symdef.pas), tdef.structalignment (idem) and the varalign information in compiler/systems/i_*.pas. And the latter information in turn can be overridden by the programmer with -Oa switch and the {$codealign ...} directive, or is sometimes also adjusted by us when e.g. new data types are introduced, when bugs are found or when support for a new ABI is added that has different requirements (some OSes support multiple ABIs).

I don't think documenting it in our manual is a good idea. It's not something people should depend on beyond what the official platform ABIs say, and those documents are maintained separately from our manual (and unfortunately seldom have stable URLs that can be referred to).


Jonas


Re: Memory alignment with FPC

Marco van de Voort
In reply to this post by Nico Erfurth-2
In our previous episode, Nico Erfurth said:
> x86 can handle unaligned access, but most implementations (I think
> current atoms and via nano are an exception) will suffer a rather high
> performance penalty.

I thought most modern x86s only had a penalty when an unaligned
access crossed a cache line boundary? (32 bytes now, 64 bytes on Haswell)
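
Whether a given access crosses a cache line is simple arithmetic. A minimal Python sketch follows; the function name is made up and the default line size of 64 bytes is an assumption, so pass whatever size applies to the CPU in question.

```python
def crosses_cache_line(address: int, size: int, line_size: int = 64) -> bool:
    """True if the access [address, address + size) spans two cache lines."""
    first_line = address // line_size
    last_line = (address + size - 1) // line_size
    return first_line != last_line

print(crosses_cache_line(57, 4))  # False: unaligned, but inside one 64-byte line
print(crosses_cache_line(60, 8))  # True: bytes 60..67 straddle the boundary at 64
print(crosses_cache_line(56, 8))  # False: aligned and inside one line
```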


Re: Memory alignment with FPC

Darius Blaszyk
In reply to this post by Jonas Maebe-2

On 11 Oct '12, Jonas Maebe wrote:

> As mentioned before, it not only depends on the platform, but also on the contents of the object/record. E.g., a badly misaligned double will generally give much worse performance even on Intel.
>
>> This makes me wonder if
>> choosing a proper value for $PACKRECORDS could make my file readable
>> safely on all platforms, only needing to convert the endianness if
>> applicable. This would not force me to do manual padding in my structs.
>> Say I use a value of 16 would that cover all ABIs FPC currently
>> supports?
>
> Yes.

So misalignment of for instance a double (or whatever type) will only happen if the record is packed and the packed value is smaller than what the ABI prescribes, correct?

 

Let's assume I set the record packing to 16 bytes; this would make reading and writing records as a whole safe on all platform/architecture combinations, right? Apart from a few padding bytes, what are the performance penalties of doing this then? Why would there be penalties?

Darius



Re: Memory alignment with FPC

Jonas Maebe-2

On 11 Oct 2012, at 15:00, [hidden email] wrote:

> So misalignment of for instance a
> double (or whatever type) will only happen if the record is packed and
> the packed value is smaller than what the ABI prescribes, correct?

Yes.

> Let's assume I set the record to packed 16bytes, this would make
> reading and writing records as a whole safe on all platform/
> architecture
> combinations right? Apart from a few padding bytes, what are the
> performance penalties of doing this then? Why would there be  
> penalties?

The cpu cache will contain lots of unused padding bytes.
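
A back-of-the-envelope sketch of that effect, in Python rather than FPC. The record (byte, double, byte) and the 64-byte cache line are assumptions for the example; the padded variant spells out the padding explicitly with `x` pad bytes so that the numbers do not depend on the machine running it.

```python
import struct

# Same three fields (byte, double, byte); "<" means no implicit padding.
packed = struct.calcsize("<bdb")      # fields back to back: 1 + 8 + 1
# 'x' is an explicit pad byte: 7 after each single-byte field, so the double
# sits at offset 8 and the record size is a multiple of 8.
padded = struct.calcsize("<b7xdb7x")  # 1 + 7 + 8 + 1 + 7

LINE = 64  # assumed cache line size in bytes
per_line_packed = LINE // packed
per_line_padded = LINE // padded

print(packed, padded)                    # 10 24
print(per_line_packed, per_line_padded)  # 6 2
```

With these assumptions, three times as many packed records fit in a cache line, at the price of a misaligned double in each of them.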


Jonas

Re: Memory alignment with FPC

Marco van de Voort
In our previous episode, Jonas Maebe said:
> > reading and writing records as a whole safe on all platform/
> > architecture
> > combinations right? Apart from a few padding bytes, what are the
> > performance penalties of doing this then? Why would there be  
> > penalties?
>
> The cpu cache will contain lots of unused padding bytes.

And operations that move records will move more bytes. (e.g. reallocation).


Re: Memory alignment with FPC

Darius Blaszyk
In reply to this post by Jonas Maebe-2

On 11 Oct '12, Jonas Maebe wrote:

> On 11 Oct 2012, at 15:00, [hidden email] wrote:
>> So misalignment of for instance a double (or whatever type) will only happen if the record is packed and the packed value is smaller than what the ABI prescribes, correct?
> Yes.
>> Let's assume I set the record to packed 16bytes, this would make reading and writing records as a whole safe on all platform/ architecture combinations right? Apart from a few padding bytes, what are the performance penalties of doing this then? Why would there be penalties?
> The cpu cache will contain lots of unused padding bytes.

Thanks, I think everything is clear now. My plan is to respect the default padding and write records to disk in one go. The padding value will be written to the file header, so the records can be read back one variable at a time when the padding differs; otherwise they will be read back in one go again. This will surely come at a cost, but only if the file is shared between different ABIs (as is the case when sharing between different endianness). The result will be that the data structures always use the default padding internally, making optimal use of the CPU.

 

So is there a way to get the padding value at runtime?

 

Darius



Re: Memory alignment with FPC

Jonas Maebe-2

On 11 Oct 2012, at 15:23, [hidden email] wrote:

> Thanks, I think everything is clear now. My plan now is to
> respect default padding and write records in one go to disk. The padding
> value will be written to the file header so the records can be read back
> one variable at a time when padding differs, otherwise they will be read
> back in one go again. This will sure come at a cost, but only if the
> file is shared between different ABIs (as is the case when sharing
> between different endianness). The result will be that the data
> structures will be at default padding internally always making optimal
> use of the CPU.
>
> So is there a way to get the padding value at runtime?

No. You really should write the fields one by one. Yes, it's slower. That's the cost of portability. You can always optimize by first writing them to a buffer and then writing the buffer in one go.
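
The buffering suggestion can be sketched as follows, in Python rather than FPC. The record shape (a 4-byte id and an 8-byte double), the little-endian byte order and the function names are all assumptions for the example: every field is packed individually into an in-memory buffer, and the file sees a single write.

```python
import io
import struct

def serialize(records) -> bytes:
    """Pack each field individually into an in-memory buffer."""
    buf = io.BytesIO()
    for ident, value in records:
        buf.write(struct.pack("<I", ident))  # 4-byte unsigned id
        buf.write(struct.pack("<d", value))  # 8-byte double
    return buf.getvalue()

def save(path, records) -> None:
    """Write the whole buffer to disk with one write call."""
    with open(path, "wb") as f:
        f.write(serialize(records))

payload = serialize([(1, 0.5), (2, 2.0)])
print(len(payload))  # 24: two records of 4 + 8 bytes, no padding
```

The per-field work stays in memory, so the portable layout costs CPU time only and does not multiply the number of I/O calls.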


Jonas


Re: Memory alignment with FPC

Mark Morgan Lloyd-5
In reply to this post by Marco van de Voort
Marco van de Voort wrote:
> In our previous episode, Nico Erfurth said:
>> x86 can handle unaligned access, but most implementations (I think
>> current atoms and via nano are an exception) will suffer a rather high
>> performance penalty.
>
> I thought most modern x86's only had a penalty when an unaligned
> access crossed a cacheline boundery ? (32 bytes now, 64 bytes on Haswell)

In any event, I run FPC and Lazarus on SPARC which is susceptible to
misalignment and am not currently aware of any problems.

--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]

Re: Memory alignment with FPC

Darius Blaszyk
In reply to this post by Jonas Maebe-2

On 11 Oct '12, Jonas Maebe wrote:

> On 11 Oct 2012, at 15:23, [hidden email] wrote:
>
>> Thanks, I think everything is clear now. My plan now is to
>> respect default padding and write records in one go to disk. The padding
>> value will be written to the file header so the records can be read back
>> one variable at a time when padding differs, otherwise they will be read
>> back in one go again. This will sure come at a cost, but only if the
>> file is shared between different ABIs (as is the case when sharing
>> between different endianness). The result will be that the data
>> structures will be at default padding internally always making optimal
>> use of the CPU.
>>
>> So is there a way to get the padding value at runtime?
>
> No. You really should write the fields one by one. Yes, it's slower. That's the cost of portability. You can always optimize by first writing them to a buffer and then writing the buffer in one go.
>
> Jonas

Sorry to keep asking questions, but why write them one by one? If I stored the offset each variable has at the time of writing (only needed once per record type), I could easily make the loading work, even if the ABI changes when the file is read back. What makes you prefer writing the variables one by one over writing them all at once?

Darius

 

 


Re: Memory alignment with FPC

Jonas Maebe-2

On 11 Oct 2012, at 16:11, [hidden email] wrote:

> On 11 Oct '12, Jonas Maebe wrote:
>
>> No. You really should write the fields one by one. Yes,
>> it's slower. That's the cost of portability. You can always optimize by
>> first writing them to a buffer and then writing the buffer in one go.
>
> Sorry I keep asking questions, but why write them one by one? If
> I would store the offset each variable has at the time of writing (only
> need to do one time per record type), I could easily make the loading
> work (even if the ABI changes when the file is read back). What makes
> you prefer writing the variables one by one over once at a time?

I always prefer simple techniques over elaborate strategies aimed at optimizing things, especially if it's not clear that they will ever be the performance bottleneck in the first place. You're moreover trading space (storing all the offsets) for CPU operations here, and I/O is generally two or more orders of magnitude slower than moving data in memory.


Jonas
