Findfirst/findnext with a samba share

Max Vlasov
Hi,

Working with Lazarus on Linux Mint, I noticed that when I mount a Windows share with "Connect to server", everything works fine (Nautilus, Midnight Commander, Lazarus file operations). But since the .gvfs folder is hidden, I tried mounting the share directly from the shell with "mount -t cifs".

With this mount, Nautilus and mc worked as before, but Lazarus stopped seeing some of the files (for example, 3 out of a folder containing 76).

I ran Lazarus in the debugger and narrowed the problem down to findFirst/findNext, which calls fpReadDir, which in turn calls Linux's getdents64 to list directory entries (since this is the FPC RTL, I'm posting here). Searching the web for mentions of this, I found reports in the bug tracker that attribute it to a Samba bug (https://bugzilla.samba.org/show_bug.cgi?id=8044).

The question is: what do Nautilus and Midnight Commander do differently from FPC's directory-listing functions that allows them to list directories correctly despite the bug?

Thanks,

Max Vlasov

_______________________________________________
fpc-pascal maillist  -  [hidden email]
http://lists.freepascal.org/mailman/listinfo/fpc-pascal

Re: Findfirst/findnext with a samba share

Ludo Brands
On 03/01/2013 09:14 AM, Max Vlasov wrote:

> Hi,
>
> Working with lazarus on Linux Mint I noticed that when I mount a windows
> shared with "Connect to server", everything works fine (Nautilus,
> Midnight Commander, Lazarus file operations). But meeting that .gvfs
> folder is hidden, I tried to mount "mount -t cifs" directly  in the shell.
>
> After such mapping Nautilus and mc worked the same, but Lazarus stopped
> seeing some of files (for example 3 from the folder containing 76).
>
> I tried to run lazarus in the debugger and narrowed it to findFirst/Next
> that calls fpReadDir and this one calls linux' getdents64 to list
> directory entries (since it's fpc rtl, I'm posting here). Querying for
> mentioning this at the web, I found that there were reports in the
> bugtracker that attributed this to a samba bug
> (https://bugzilla.samba.org/show_bug.cgi?id=8044 ).
>
> The question is what is different in nautilus and midnight commander
> comparing to fpc directory-listing functions that allows them to list
> directories correctly in the bug's case?
>

The difference with other tools is that FPC gives a very small buffer to
getdents64. The strace for ls in the bug report
http://bugs.freepascal.org/view.php?id=23732 shows that only one
getdents is needed to get the full dir because the buffer size is much
bigger.
I don't know why getdents64 gets a buffer of only 280 bytes from FPC.
This seems ridiculously low to me on modern systems, and it also has an
impact on the speed of findFirst/findNext.
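
To put rough numbers on the buffer-size difference, here is a minimal Python sketch. It assumes the Linux linux_dirent64 record layout (a 19-byte header, the name, a terminating NUL, padded to a multiple of 8 bytes) and greedy packing of records into each buffer. The helper names and the sample file names are hypothetical, and a real file system may return fewer entries per call than the buffer allows.

```python
def record_length(name: str) -> int:
    """Size of one linux_dirent64 record for the given file name."""
    header = 8 + 8 + 2 + 1                    # d_ino, d_off, d_reclen, d_type
    raw = header + len(name.encode()) + 1     # name plus terminating NUL
    return (raw + 7) // 8 * 8                 # padded to a multiple of 8


def calls_needed(names, buffer_size: int) -> int:
    """Count the getdents64 calls that return data, packing records greedily."""
    calls, used = 0, 0
    for name in names:
        size = record_length(name)
        if size > buffer_size:
            raise ValueError("buffer too small for a single record")
        if used == 0 or used + size > buffer_size:
            calls += 1                        # start filling a fresh buffer
            used = 0
        used += size
    return calls


# Hypothetical directory: 76 files with 11-character names.
names = [f"file{i:03d}.txt" for i in range(76)]
small_buffer_calls = calls_needed(names, 280)
large_buffer_calls = calls_needed(names, 32 * 1024)
```

Under these assumptions each record takes 32 bytes, so a 280-byte buffer holds 8 records and the listing needs 10 calls that return data, while a 32k buffer needs a single call.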

Ludo

Re: Findfirst/findnext with a samba share

Max Vlasov

On Fri, Mar 1, 2013 at 12:32 PM, Ludo Brands <[hidden email]> wrote:
> On 03/01/2013 09:14 AM, Max Vlasov wrote:
> >
> > The question is what is different in nautilus and midnight commander
> > comparing to fpc directory-listing functions that allows them to list
> > directories correctly in the bug's case?
> >
>
> The difference with other tools is that FPC gives a very small buffer to
> getdents64.

Thanks. Did I understand this right? It's a Samba bug, but it appears only in the specific context of small buffers; other programs (Nautilus, mc) usually use larger buffers and so effectively work around it.

Max

Re: Findfirst/findnext with a samba share

Ludo Brands
On 03/01/2013 10:55 AM, Max Vlasov wrote:

>
> On Fri, Mar 1, 2013 at 12:32 PM, Ludo Brands <[hidden email]> wrote:
>
>     On 03/01/2013 09:14 AM, Max Vlasov wrote:
>     >
>     > The question is what is different in nautilus and midnight commander
>     > comparing to fpc directory-listing functions that allows them to list
>     > directories correctly in the bug's case?
>     >
>
>     The difference with other tools is that FPC gives a very small buffer to
>     getdents64.
>
>
> Thanks, did I understand this right? It's a samba bug, but it appears
> only in a specific context of small buffers, other programs (nautilus,
> mc) usually use larger buffers so has a kind of workaround this bug.
>

The bug results in dropping a file between 2 consecutive calls to
getdents/getdents64. Making the buffer large enough for most common
cases is indeed a workaround, though I doubt that other programs use
bigger buffers solely to work around a samba bug.
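
The effect of the number of calls can be illustrated with a toy model. The Python sketch below is not how Samba or the cifs client actually behaves. It simply assumes, pessimistically, that one entry is lost every time a listing has to resume with a further call, so the loss count here will not match any real report. All names and the entries-per-call figure are hypothetical.

```python
def list_directory(names, entries_per_call, lose_on_resume):
    """Toy model: read a directory in chunks, optionally losing one entry
    each time the listing resumes after a previous call."""
    seen = []
    pos = 0
    first_call = True
    while pos < len(names):
        if lose_on_resume and not first_call:
            pos += 1                          # the entry at the boundary is skipped
        seen.extend(names[pos:pos + entries_per_call])
        pos += entries_per_call
        first_call = False
    return seen


names = [f"file{i:03d}.txt" for i in range(76)]

# Small buffer: many calls, so many boundaries at which an entry can go missing.
small = list_directory(names, 8, lose_on_resume=True)

# Large buffer: everything arrives in one call, so there is no boundary at all.
large = list_directory(names, 1000, lose_on_resume=True)

missing = sorted(set(names) - set(small))
```

In this model the single-call listing is complete even with the fault enabled, which is the sense in which a large buffer hides the problem rather than fixing it.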

Perhaps one of the core devs can comment on the reason for the small
buffer size in FPC.

Ludo

Re: Findfirst/findnext with a samba share

Marco van de Voort
In reply to this post by Ludo Brands
In our previous episode, Ludo Brands said:

> >
> > The question is what is different in nautilus and midnight commander
> > comparing to fpc directory-listing functions that allows them to list
> > directories correctly in the bug's case?
> >
>
> The difference with other tools is that FPC gives a very small buffer to
> getdents64. The strace for ls in the bug report
> http://bugs.freepascal.org/view.php?id=23732 shows that only one
> getdents is needed to get the full dir because the buffer size is much
> bigger.
> I don't know why getdents64 gets a buffer of only 280 bytes from FPC.
> This seems to me ridiculously low in modern systems and it has also an
> impact on the speed of findFirst/findNext.
>

Maybe even wrong. I checked the FreeBSD implementation for the buffersize
(1kb btw), and there is the following comment (probably based on some
comment from FreeBSD sources):

"Getdents requires the buffer to be larger than the blocksize.
This usually the sectorsize =512 bytes, but maybe tapedrives and harddisks
with blockmode have this higher?"

Maybe Linux has a similar requirement.

Re: Findfirst/findnext with a samba share

Sven Barth-2
On 01.03.2013 12:31, Marco van de Voort wrote:

> In our previous episode, Ludo Brands said:
>>>
>>> The question is what is different in nautilus and midnight commander
>>> comparing to fpc directory-listing functions that allows them to list
>>> directories correctly in the bug's case?
>>>
>>
>> The difference with other tools is that FPC gives a very small buffer to
>> getdents64. The strace for ls in the bug report
>> http://bugs.freepascal.org/view.php?id=23732 shows that only one
>> getdents is needed to get the full dir because the buffer size is much
>> bigger.
>> I don't know why getdents64 gets a buffer of only 280 bytes from FPC.
>> This seems to me ridiculously low in modern systems and it has also an
>> impact on the speed of findFirst/findNext.
>>
>
> Maybe even wrong. I checked the FreeBSD implementation for the buffersize
> (1kb btw), and there is the following comment (probably based on some
> comment from FreeBSD sources):
>
> "Getdents requires the buffer to be larger than the blocksize.
> This usually the sectorsize =512 bytes, but maybe tapedrives and harddisks
> with blockmode have this higher?"
>
> Maybe Linux has a similar requirement.

It seems that at least in 2001 there was an entry in the man page about
this, but in current man pages about getdents(64) there is nothing about
it. Maybe we should check whether the following would work (in
rtl/linux/ossysc.inc, fpreaddir):

Currently FPC allocates only one pdirent in fpopendir
(rtl/linux/ossysc.inc). Maybe it should first stat the directory and
then decide based on st_blksize how many pdirent entries to allocate
(but it should also provide a sane default, as there is the possibility
that st_blksize is 0 for a directory).
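
The stat-based sizing suggested here might look roughly like the following Python sketch. `os.stat` and its `st_blksize` field are real; the function names and the 4096-byte fallback are assumptions made for illustration and are not FPC code.

```python
import os

FALLBACK_SIZE = 4096  # hypothetical default, used when st_blksize is missing or 0


def pick_buffer_size(blksize, fallback=FALLBACK_SIZE):
    """Use the reported block size when it is usable, otherwise the fallback."""
    return blksize if blksize and blksize > 0 else fallback


def directory_buffer_size(path):
    """Stat the directory and derive a directory-read buffer size from st_blksize."""
    info = os.stat(path)
    # st_blksize is not available on every platform, hence getattr.
    return pick_buffer_size(getattr(info, "st_blksize", 0))


size_for_cwd = directory_buffer_size(".")
```

Keeping the fallback decision in a separate function makes the "st_blksize is 0" case easy to exercise without needing a file system that actually reports it.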

Regards,
Sven

Re: Findfirst/findnext with a samba share

Ludo Brands
On 03/01/2013 12:52 PM, Sven Barth wrote:

> Currently FPC allocates only one pdirent in fpopendir
> (rtl/linux/ossysc.inc). Maybe it should first stat the directory and
> then decide based on st_blksize how much pdirent entries to allocate
> (but it should also provide a sane default, as there is the possibility
> that st_blksize is 0 for a directory).
>

The dirent struct on Linux contains a char d_name[256]; as the last
element. The kernel returns only a null-terminated string instead of the
full 256 chars. That is why you can get many files returned in one 280
byte block. Allocating n pdirent entries for n files would be overkill.
The glibc getdents.c source uses some heuristics to determine how many
entries will fit in a buffer. They use an average of 14 chars for the
filename.
The same source code also says in comments that the kernel limits the
number of entries returned per call but doesn't say what the limit is.
It probably depends on the underlying file system. This could explain
why programs such as ls that use a 32k buffer never see the problem (at
least I haven't found any reports).
What about just using a 32k buffer?
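
The variable-length records can be made concrete with a small Python sketch that builds a buffer in the linux_dirent64 layout and walks it by record length. The layout (d_ino, d_off, d_reclen, d_type, then the NUL-terminated name, padded to 8 bytes) is the Linux one; little-endian byte order, the helper names, and the sample entries are assumptions, and no real system call is made.

```python
import struct

# d_ino (u64), d_off (s64), d_reclen (u16), d_type (u8): 19 bytes, no padding.
HEADER = struct.Struct("<QqHB")


def pack_entry(ino, off, d_type, name):
    """Build one record: header, NUL-terminated name, zero padding to 8 bytes."""
    raw_name = name.encode() + b"\0"
    reclen = (HEADER.size + len(raw_name) + 7) // 8 * 8
    record = HEADER.pack(ino, off, reclen, d_type) + raw_name
    return record.ljust(reclen, b"\0")


def walk(buffer):
    """Return the names in a packed buffer, stepping by each record's d_reclen."""
    names = []
    pos = 0
    while pos < len(buffer):
        _ino, _off, reclen, _d_type = HEADER.unpack_from(buffer, pos)
        start = pos + HEADER.size
        end = buffer.index(b"\0", start)      # the name ends at the first NUL
        names.append(buffer[start:end].decode())
        pos += reclen                         # jump to the next record
    return names


entries = ["a.txt", "readme", "x"]
packed = b"".join(pack_entry(i + 1, i + 1, 8, name) for i, name in enumerate(entries))
listed = walk(packed)
```

Here three short names occupy far less than 280 bytes in total, whereas three fixed-size records with a full 256-character name field would not fit in a buffer of that size.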

Ludo

Re: Findfirst/findnext with a samba share

Michael Van Canneyt


On Fri, 1 Mar 2013, Ludo Brands wrote:

> On 03/01/2013 12:52 PM, Sven Barth wrote:
>
>> Currently FPC allocates only one pdirent in fpopendir
>> (rtl/linux/ossysc.inc). Maybe it should first stat the directory and
>> then decide based on st_blksize how much pdirent entries to allocate
>> (but it should also provide a sane default, as there is the possibility
>> that st_blksize is 0 for a directory).
>>
>
> The dirent struct on linux contains a char d_name[256]; as the last
> element. The kernel returns only a null terminated string instead of the
> full 256 chars. Reason why you can get many files returned in one 280
> byte block. Allocating n pdirent entries for n files would be an overkill.
> The glibc getdents.c source uses some heuristics to determine how many
> entries will fit in a buffer. They use an average of 14 chars for the
> filename.
> The same source code also says in comments that the kernel limits the
> number of entries returned per call but doesn't say what the limit is.
> It probably depends on the underlying file system. This could explain
> why programs such as ls that use a 32k buffer never see the problem (at
> least I haven't found any reports).
> What about just using a 32k buffer?

No problem with that as far as I am concerned.

Since the code is shared between embedded and non-embedded targets,
it might be better to have the option to specify the size of the
buffer to use.

In recursive cases, it means that Depth*BufferSize buffers will be allocated.
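
As a rough illustration of that cost, the following Python sketch multiplies recursion depth by buffer size, assuming one buffer stays allocated per open directory level. The function name and the depth of 16 are hypothetical; the two buffer sizes are the ones discussed in this thread.

```python
def recursion_buffer_bytes(depth, buffer_size):
    """Total buffer memory held while `depth` nested directories are open."""
    return depth * buffer_size


# Hypothetical directory tree that is 16 levels deep.
with_280_byte_buffers = recursion_buffer_bytes(16, 280)
with_32k_buffers = recursion_buffer_bytes(16, 32 * 1024)
```

At this depth the totals are 4480 bytes versus 512 KiB, which is negligible on a desktop but may matter on a small embedded target.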

Michael.

Re: Findfirst/findnext with a samba share

Sven Barth-2
In reply to this post by Ludo Brands
On 01.03.2013 16:08, Ludo Brands wrote:

> On 03/01/2013 12:52 PM, Sven Barth wrote:
>
>> Currently FPC allocates only one pdirent in fpopendir
>> (rtl/linux/ossysc.inc). Maybe it should first stat the directory and
>> then decide based on st_blksize how much pdirent entries to allocate
>> (but it should also provide a sane default, as there is the possibility
>> that st_blksize is 0 for a directory).
>>
>
> The dirent struct on linux contains a char d_name[256]; as the last
> element. The kernel returns only a null terminated string instead of the
> full 256 chars. Reason why you can get many files returned in one 280
> byte block. Allocating n pdirent entries for n files would be an overkill.
> The glibc getdents.c source uses some heuristics to determine how many
> entries will fit in a buffer. They use an average of 14 chars for the
> filename.

While the kernel might pass less than 256 characters the dirent
structure contains a "dd_nextoff" field which is already used in FPC's
fpreaddir call to locate the next returned entry.

Regards,
Sven

Re: Findfirst/findnext with a samba share

Ludo Brands
On 03/01/2013 06:02 PM, Sven Barth wrote:

>
> While the kernel might pass less than 256 characters the dirent
> structure contains a "dd_nextoff" field which is already used in FPC's
> fpreaddir call to locate the next returned entry.
>

Yes, I know. Otherwise we would lose more than one file name per buffer,
and not only with samba :)

Ludo


Re: Findfirst/findnext with a samba share

Ludo Brands
In reply to this post by Michael Van Canneyt
On 03/01/2013 04:37 PM, Michael Van Canneyt wrote:

>
> On Fri, 1 Mar 2013, Ludo Brands wrote:
>
>> What about just using a 32k buffer?
>
> No problem with that as far as I am concerned.
>
> Since the code is shared between embedded and non-embedded targets, it
> might be better to have the option to specify the size of the buffer to
> use.
>
Being able to override the buffer size at runtime would indeed be the
best solution. Embedded systems that have a file system but are too short
on memory to accommodate a 32k block are rather uncommon these days.

> In recursive cases, it means that Depth*BufferSize buffers will be
> allocated.
>
Yes.

Ludo


Re: Findfirst/findnext with a samba share

tvr
Hello,

do you plan to change the behavior of the findnext function to eliminate the samba bug?

tomas

Re: Findfirst/findnext with a samba share

Jonas Maebe-2

On 05 Apr 2013, at 18:32, tvr wrote:

> do you plan change behavior of findnext function to eliminate samba bug ?

It probably won't eliminate it, just make it rarer. You can probably get the same result with "ls" if you put so many files in a directory that their combined info does not fit in a 32kb block. Whatever workaround FPC might add, it really should be fixed in Samba in the first place.


Jonas
