Discussion:
[datatable-help] Random segfaults
Chris Neff
2011-12-14 15:46:26 UTC
This will be a crappy help request because I can't seem to reproduce
it, but the past few days I've been getting a lot of segfaults. The
only common thing between every crash is that it happens when I do

DT[, z := x]

where z was not a column that existed in DT before, and x is either an
existing column of DT or a separate variable, doesn't matter. Beyond
that I can't reproduce a set of steps that gets R to crash. This is
with the latest SVN version.

Is there more information I can provide to help track this down?
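
For anyone trying to reproduce this, a minimal sketch of the operation described above might look like the following. The table contents and the names `DT`, `x`, `y`, `z`, `w` and `v` are made up for illustration; this is not the code from the report, and it is not expected to crash on its own.

```r
library(data.table)

# Hypothetical small table; the real data in the report is unknown.
DT <- data.table(x = 1:5, y = letters[1:5])

# Add a new column z by reference, taken from an existing column of DT.
DT[, z := x]

# Add a new column w by reference, taken from a separate variable outside DT.
v <- seq(10, 50, by = 10)
DT[, w := v]

print(names(DT))
```

In both cases the column did not exist before the call, which is the situation the crashes have in common.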
Matthew Dowle
2011-12-14 16:43:43 UTC
You're on R < 2.14.0, right? I'm really struggling to make over-allocation
work in R < 2.14.0, because R only started to initialize truelength to 0 in
R 2.14.0. Before that it's uninitialized (random). The trouble is that my
workarounds for R < 2.14.0 work fine for me on 32bit linux, where I test
in R 2.13.2 and even 2.12.0. I test on 64bit too, but only with 2.14.0.
CRAN is also showing errors on 2.13.2 (old-rel) for both mac and windows.

So, this is a pre-2.14.0 (only) problem that I'll continue to try and fix.

Are you 64bit pre-2.14.0? Which OS? If you are 64bit linux then it adds
weight to me installing pre-2.14.0 on my 64bit instance in an effort to
reproduce.
_______________________________________________
datatable-help mailing list
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
Chris Neff
2011-12-14 17:13:37 UTC
64 bit 2.12.1 linux.

Is there an option I can set in my session in order to work around the
truelength issue? I don't care if I lose some of the over-allocation
niceties if it stops things from crashing. Looking at the truelength
help, would just doing:

options(datatable.alloc=quote(1000))

stop this? I never have more than about 50 columns at a time.
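
A rough sketch of how the over-allocation could be inspected and the option set, under some assumptions: that the installed data.table exports `truelength()`, and that the option is spelled `datatable.alloccol` as in the `?truelength` help page (the line above writes `datatable.alloc`). The table is hypothetical, and whether this avoids the crash is exactly the open question.

```r
library(data.table)

# Assumption: the option is named datatable.alloccol, as in ?truelength.
# Keep the old value so it can be restored afterwards.
old <- options(datatable.alloccol = 1000L)

# Hypothetical two-column table, created after the option is set.
DT <- data.table(x = 1:3, y = c("a", "b", "c"))

# length() is the number of columns in use; truelength() is the number of
# column slots allocated, i.e. how far := can grow without reallocating.
print(c(used = length(DT), allocated = truelength(DT)))

# Adding a column by reference should fit into the spare slots.
DT[, z := x]
print(c(used = length(DT), allocated = truelength(DT)))

# Restore the previous option value.
options(old)
```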
Matthew Dowle
2011-12-14 17:31:06 UTC
Maybe, worth a try. Are you loading any data.table objects from disk?
Chris Neff
2011-12-14 17:33:19 UTC
Nope. Created fresh every time.
Timothée Carayol
2011-12-14 17:40:15 UTC
Hi --

About 10 days ago I was having many unreproducible bugs with R 2.14,
data.table 1.7.4 and 64-bit Ubuntu. Data was getting corrupted, and then
R crashed. I had to go back to data.frame for the bits of code affected.
I was doing a lot of rather unsafe manipulations with row names, rbind
and cbind.
I didn't file a report or mention it, as it was occurring seemingly at
random, and I was doing operations which aren't really what data.table
was made for (tons of little manipulations on small data). Still, I
guess I should mention now that 2.14 didn't fix everything for me. I do
not know whether the bugs persist in post-1.7.4 versions.

t
Chris Neff
2011-12-14 18:00:50 UTC
It is crashing several times an hour for me to the point where this is
unusable. Setting the alloccol option didn't help. Is there a way I
can go back to the old shallow copy ways of 1.7.3?
Matthew Dowle
2011-12-14 20:43:05 UTC
I found and fixed one simple error that would definitely explain this in
R <= 2.13.2. I've just committed the fix. Please grab the latest svn and try
again. I also added more checks and warnings in case that's not the only
problem.
Matthew Dowle
2011-12-15 00:56:21 UTC
Permalink
Hm. Sounds like it could be a different problem, then, if it was in R
2.14. There have been quite a few fixes since 1.7.4, so if you can
reproduce it with 1.7.7 that would be great. Or, we've sometimes seen that
a clean re-install just after a package upgrade can fix things, perhaps
because the .so was in use by another R process or a zombie, or
something. R seems to report data.table v1.7.4 (say) but it hasn't fully
installed it and is still (perhaps partially) at 1.7.3. So, next time it
happens, quit all R sessions (perhaps reboot to clear zombies too) and try
reinstalling using R CMD INSTALL. You can also run test.data.table() to
check the install.
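
After reinstalling, a quick check from a fresh R session might look something like this sketch. It only uses base R's `packageVersion()` and `find.package()` plus the `test.data.table()` call mentioned above; the `interactive()` guard is my own addition so that the test suite is not triggered from scripts.

```r
library(data.table)

# Version that R reports for the installed package.
print(packageVersion("data.table"))

# Library directory the package was loaded from, in case several
# libraries hold different copies.
print(find.package("data.table"))

# Run the package's own test suite only when working interactively,
# since it can take a while.
if (interactive()) {
  test.data.table()
}
```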
Chris Neff
2011-12-15 12:36:19 UTC
I've updated to the latest SVN version, and I'll be sure to let you
know if it still crashes (however I do have the alloccol option set to
1000, so I shouldn't be bumping into reallocation very often). Thanks
for finding the bug so fast!
Matthew Dowle
2011-12-15 13:32:02 UTC
Great, fingers and toes crossed. If you could unset the alloccol option just
to be sure, please, that would be great. You're our best hope of confirming
it's fixed since it was biting you several times an hour. If you use [<-
or $<- syntax then R will copy via *tmp*, and at that point the *tmp*
data.table is similar to a data.table loaded from disk in that it isn't
over-allocated anymore, I realised. Also, a copy() will lose
over-allocation until the next column addition. That 'should' all be fine
now in both <=2.13.2 and >=2.14.0, although the bug was something simpler.

1.7.7 is on CRAN now and has been built for windows, so if the CRAN check
results tick over from "ERROR" to "OK" later today (for both windows and mac
old-rel), and you're ok too, then it's fixed.
Chris Neff
2011-12-15 13:43:44 UTC
Latest SVN version, no alloccol set, still crashing a lot. I don't
use [<- or $<-, the only times I modify a data.table are with := or
by doing DT=merge(DT,blah).

Any more info I can provide?
Post by Matthew Dowle
Great fingers and toes crossed. If you could unset alloccol option just to
be sure please, that would be great. You're our best hope of confirming
it's fixed since it was biting you several times an hour. If you use [<-
or $<- syntax then R will copy via *tmp* and at that point the *tmp*
data.table is similar to a data.table loaded from disk in that it isn't
over-allocated anymore, I realised. Also a copy() will lose
over-allocation until the next column addition.  That 'should' all be fine
now in both <=2.13.2 and >=2.14.0, although the bug was something simpler.
1.7.7 is on CRAN now and been built for windows so if CRAN check results
tick over from "ERROR" to "OK" later today (for both windows and mac
old-rel), and, you're ok too, then it's fixed.
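A minimal sketch of the over-allocation being discussed, assuming a current data.table release in which truelength() is exported; the 2011 SVN build in this thread may behave differently. The table and column names are made up for illustration.

```r
library(data.table)

DT <- data.table(x = 1:3)

# Columns in use versus column-pointer slots allocated for the table.
cols_used  <- length(DT)
slots_have <- truelength(DT)

# := adds a column by reference; when spare slots are available the
# table does not have to be reallocated to hold the new column.
DT[, z := x]

print(c(used = length(DT), allocated = truelength(DT)))
```

If the allocated count is not above the used count, the next := on a new column has to grow the allocation first, which is the code path this thread is about.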
Post by Chris Neff
I've updated to the latest SVN version, and I'll be sure to let you
know if it still crashes (however I do have the alloccol option set to
1000, so I shouldn't be bumping into reallocation very often). Thanks
for finding the bug so fast!
Post by Matthew Dowle
Hm. Sounds like it could be a different problem then if it was in R
2.14. There have been quite a few fixes since 1.7.4, so if you can
reproduce with 1.7.7 that would be great. Or, we've sometimes seen that
just after a package upgrade a clean re-install can fix things.
Perhaps the .so was in use by another R process or a zombie, or
something. R seems to report data.table v1.7.4 (say) but it hasn't fully
installed it and is still (perhaps partially) at 1.7.3. So quit
all R (reboot to clear zombies too perhaps) and try reinstalling using R
CMD INSTALL. Next time it happens I mean. Can also run test.data.table()
to check the install.
Post by Timothée Carayol
Hi --
I have been having many unreproducible bugs with R 2.14, data.table
1.7.4 and ubuntu 64 bits about 10 days ago. Data was getting
corrupted, and then R crashed. I had to go back to data.frame for the
bits of code affected. I was doing a lot of rather unsafe
manipulations with row names, rbind and cbinds.
I didn't file a report, nor signal it, as it was occurring seemingly
at random, and I was doing operations which aren't really what
data.table was made for (tons of little manipulations on small data);
still I guess I should now signal that 2.14 didn't fix everything for
me. I do not know whether bugs subsist on post-1.7.4 versions.
t
Post by Matthew Dowle
Maybe, worth a try. Are you loading any data.table objects from disk?
Post by Chris Neff
64 bit 2.12.1 linux.
Is there an option I can set in my session in order to work around the
truelength issue? I don't care if I lose some of the over-allocation
niceties if it stops things from crashing. Looking at the truelength
options(datatable.alloc=quote(1000))
stop this? I never have more than about 50 columns at a time.
Matthew Dowle
2011-12-15 14:43:26 UTC
Permalink
Could you send the result of test.data.table() and sessionInfo(), and
confirm it's a clean install after a reboot, to make sure no old .so is
still knocking around somehow please. Definitely installed to the right
library? If it's crashing a lot then it should be reproducible?
Still waiting for CRAN check results for 1.7.7 in old-rel. If it's not
fixed there either that'll help to know....
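The checks asked for above could be gathered in one go with something like the following. This is only a sketch using base R helpers plus data.table's own test.data.table(); the variable name pkg_dir is made up, and it does not prove that the loaded .so matches the installed sources.

```r
library(data.table)

# Version R reports for the installed package.
print(packageVersion("data.table"))

# Directory the package was loaded from, and the library search path,
# to confirm it was installed to the library you expect.
pkg_dir <- find.package("data.table")
print(pkg_dir)
print(.libPaths())

# R version, platform and attached packages.
print(sessionInfo())

# data.table's own test suite; uncomment to run it.
# test.data.table()
```

If pkg_dir is not under the library you installed to, an older copy earlier on the search path is probably being picked up.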
Matthew Dowle
2011-12-15 14:52:45 UTC
Permalink
And you did an 'svn up' (or equivalent)? Grabbing daily tar.gz snapshot
from R-Forge won't include the fix yet. So svn up, then R CMD build, then
R CMD INSTALL, right? (Just checking quick basics first).
Chris Neff
2011-12-15 15:42:53 UTC
Permalink
I always use svn up. I'll reboot and reinstall just to make sure. As
for reproducible, it still doesn't seem to crash in any consistent
place but I'll give it a stronger try with a test data set.

All 480 tests in test.data.table() completed ok in 7.395sec
R version 2.12.1 (2010-12-16)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=en_US.utf8 LC_NUMERIC=C
LC_TIME=en_US.utf8 LC_COLLATE=en_US.utf8
[5] LC_MONETARY=C LC_MESSAGES=en_US.utf8
LC_PAPER=en_US.utf8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets grid
methods base

other attached packages:
[1] hexbin_1.26.0 lattice_0.19-33 RColorBrewer_1.0-5
data.table_1.7.8 ggplot2_0.8.9 reshape_0.8.4
[6] plyr_1.6
Steve Lianoglou
2011-12-15 15:50:03 UTC
Permalink
Hi,

Out of curiosity, is it impossible for you to upgrade R to the latest, or?

-steve
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
Chris Neff
2011-12-15 16:08:09 UTC
Permalink
It's an internal build of R, so I can't upgrade until they do. I think we're
unlikely to see 2.14 any time soon.

On 15 December 2011 10:50, Steve Lianoglou
Post by Steve Lianoglou
Hi,
Out of curiosity, is it impossible for you to upgrade R to the latest, or?
-steve
Post by Chris Neff
I always use svn up. I'll reboot and reinstall just to make sure. As
for reproducible, it still doesn't seem to crash in any consistent
place but I'll give it a stronger try with a test data set.
All 480 tests in test.data.table() completed ok in 7.395sec
R version 2.12.1 (2010-12-16)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C              LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
 [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8    LC_PAPER=en_US.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C            LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  grid      methods   base

other attached packages:
[1] hexbin_1.26.0      lattice_0.19-33    RColorBrewer_1.0-5 data.table_1.7.8   ggplot2_0.8.9      reshape_0.8.4
[6] plyr_1.6
Post by Matthew Dowle
And you did an 'svn up' (or equivalent)? Grabbing daily tar.gz snapshot
from R-Forge won't include the fix yet. So svn up, then R CMD build, then
R CMD INSTALL, right? (Just checking quick basics first).
Post by Matthew Dowle
Result of test.data.table(), sessionInfo() and confirm it's a clean
install after a reboot to make sure no old .so is still knocking around
somehow please. Definitely installed to the right library? If it's
crashing a lot then it should be reproducible?
Still waiting for CRAN check results for 1.7.7 in old-rel. If it's not
fixed there either that'll help to know....
Post by Chris Neff
Latest SVN version, no alloccol set, still crashing a lot. I don't
use [<- or $<-; the only times I modify a data.table are with := or
by doing DT=merge(DT,blah).
Any more info I can provide?
Post by Matthew Dowle
Great, fingers and toes crossed. If you could unset the alloccol option just to
be sure please, that would be great. You're our best hope of confirming
it's fixed since it was biting you several times an hour. If you use [<-
or $<- syntax then R will copy via *tmp* and at that point the *tmp*
data.table is similar to a data.table loaded from disk in that it isn't
over-allocated anymore, I realised. Also a copy() will lose
over-allocation until the next column addition.  That 'should' all be fine
now in both <=2.13.2 and >=2.14.0, although the bug was something simpler.
1.7.7 is on CRAN now and been built for windows so if CRAN check results
tick over from "ERROR" to "OK" later today (for both windows and mac
old-rel), and, you're ok too, then it's fixed.
Post by Chris Neff
I've updated to the latest SVN version, and I'll be sure to let you
know if it still crashes (however I do have the alloccol option set to
1000, so I shouldn't be bumping into reallocation very often). Thanks
for finding the bug so fast!
Post by Matthew Dowle
Hm. Sounds like it could be a different problem then, if it was in R
2.14. There have been quite a few fixes since 1.7.4, so if you can
reproduce with 1.7.7 that would be great. Or, we've sometimes seen just
after a package upgrade that a clean re-install can fix things.
Perhaps the .so was in use by another R process or a zombie, or
something: R seems to report data.table v1.7.4 (say) but it hasn't fully
installed it properly and is still (perhaps partially) at 1.7.3. So quit
all R (reboot to clear zombies too perhaps) and try reinstalling using
R CMD INSTALL. Next time it happens, I mean. You can also run
test.data.table() to check the install.
Post by Timothée Carayol
Hi --
I have been having many unreproducible bugs with R 2.14, data.table
1.7.4 and ubuntu 64 bits about 10 days ago. Data was getting
corrupted, and then R crashed. I had to go back to data.frame for the
bits of code affected. I was doing a lot of rather unsafe
manipulations with row names, rbind and cbinds.
I didn't file a report, nor signal it, as it was occurring seemingly
at random, and I was doing operations which aren't really what
data.table was made for (tons of little manipulations on small data);
still I guess I should now signal that 2.14 didn't fix everything for
me. I do not know whether bugs subsist on post-1.7.4 versions.
t
On Wed, Dec 14, 2011 at 5:31 PM, Matthew Dowle
Post by Matthew Dowle
Maybe, worth a try. Are you loading any data.table objects from disk?
Post by Chris Neff
64 bit 2.12.1 linux.
Is there an option I can set in my session in order to work around the
truelength issue? I don't care if I lose some of the over-allocation
niceties if it stops things from crashing. Looking at the truelength
options(datatable.alloc=quote(1000))
stop this? I never have more than about 50 columns at a time.
Chris Neff
2011-12-15 17:26:48 UTC
Permalink
Just to follow up: it still crashes at seemingly random times. I'm
reverting to an earlier version (1.7.1) to see if that fixes my
problem.
Matthew Dowle
2011-12-15 20:05:06 UTC
Permalink
One thought ... how about turning on debugging? That way, when it crashes,
at least you can report the file and line number. Btw, I've installed
2.12.0 on 64bit to see whether that would reproduce it, but it still works
ok for me, as do 32bit 2.12.0 and both 32 and 64bit 2.14.0. So we're
left with you debugging at your end, but it should be fairly easy ...

sudo MAKEFLAGS='CFLAGS=-O0\ -g\ -Wall\ -pedantic' R CMD INSTALL data.table_1.7.7.tar.gz

R -d gdb

run

Do the stuff that crashes it. Does it report a C file and line number?

Just to rule out possible svn / R CMD build strangeness, please also use
the data.table_1.7.7.tar.gz that's on CRAN. CRAN still hasn't run its checks
for 1.7.7, so I'm on tenterhooks for that.
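
One way to rule out a stale or half-installed build from inside R is to ask the session what it actually loaded. This is only a sketch and not something posted in the thread: the pattern used to pick out data.table's shared object is an assumption (the DLL name may differ between releases), and the variable names are made up for illustration.

```r
# Sketch: confirm which data.table build this R session actually loaded.
library(data.table)

pkg_version <- packageVersion("data.table")   # version R reports for the package
pkg_path    <- find.package("data.table")     # library directory it was loaded from

# Paths of every shared object loaded in this session.
dll_paths <- vapply(getLoadedDLLs(), function(d) d[["path"]], character(1))

# Assumption: data.table's DLL name contains "datatable" or "data_table".
dt_dll <- dll_paths[grepl("data.?table", names(dll_paths), ignore.case = TRUE)]

print(pkg_version)
print(pkg_path)
print(dt_dll)
```

If the version or path printed is not the one just installed, quitting every R process and reinstalling is probably worth doing before chasing the segfault any further.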
Chris Neff
2011-12-16 14:38:28 UTC
Permalink
On the current latest SVN build, with debugging enabled as listed
below, I get the following when trying to even print the contents of a
data.table:

Error in do.call("cbind", lapply(x, format, justify = justify, ...)) :
  'getCharCE' must be called on a CHARSXP

I never saw this error without debugging. I tried printing a few times
in a row, got this same error each time, and then on about the 4th try it
segfaulted.

Having a hard time reproducing that, but at least it is something?
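
Since the failure only shows up now and then, a small loop that repeats the suspect operations might help turn it into something repeatable. The sketch below is hypothetical and not from the thread: the table contents, the iteration count and the optional use of base R's gctorture() are all assumptions, chosen only to exercise := on a new column followed by a print.

```r
# Hypothetical stress loop: add new columns with := and print, many times over.
library(data.table)

torture <- FALSE   # set TRUE to force a garbage collection on every allocation (very slow)
n_iter  <- 50L     # arbitrary; raise it if nothing fails

set.seed(1)
if (torture) gctorture(TRUE)
for (i in seq_len(n_iter)) {
  DT <- data.table(x = rnorm(100), g = sample(letters, 100, replace = TRUE))
  DT[, z := x]                           # new column copied from an existing column
  y <- runif(100)
  DT[, w := y]                           # new column taken from a separate variable
  invisible(capture.output(print(DT)))   # printing is where the getCharCE error appeared
  stopifnot(identical(DT$z, DT$x), identical(DT$w, y))
}
if (torture) gctorture(FALSE)
cat("completed", n_iter, "iterations without error\n")
```

Running this under R -d gdb with the debug build would, with luck, stop at the faulting C line; if it never fails, that at least suggests the trigger lies in something else the real session does.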
Post by Matthew Dowle
One thought ... how about turning on debugging. That way when it crashes
at least you can report the file and line number. Btw, I've installed
2.12.0 on 64bit in case that managed to reproduce, but it still works
for me ok as does 32bit 2.12.0, and both 32 and 64bit 2.14.0. So we're
left with you debugging at your end, but should be fairly easy ...
sudo MAKEFLAGS='CFLAGS=-O0\ -g\ -Wall\ -pedantic' R CMD INSTALL data.table_1.7.7.tar.gz
R -d gdb
run
Do the stuff that crashes it.  Does it report a C file and line number?
Just to rule out possible svn / R CMD build strangeness, please also use
the data.table_1.7.7.tar.gz that's on CRAN.  It still hasn't run checks
for 1.7.7 so on tenterhooks for that.
Just to come back, it still crashes at seemingly random times.   I'm
reverting back to an earlier version (1.7.1) to see if that fixes my
problem.
Internal build of R. Can't upgrade until they do.  I think it is
unlikely to see 2.14 any time soon.
On 15 December 2011 10:50, Steve Lianoglou
Post by Steve Lianoglou
Hi,
Out of curiosity, is it impossible for you to upgrade R to the latest, or?
-steve
Post by Chris Neff
I always use svn up. I'll reboot and reinstall just to make sure. As
for reproducible, it still doesn't seem to crash in any consistent
place but I'll give it a stronger try with a test data set.
All 480 tests in test.data.table() completed ok in 7.395sec
R version 2.12.1 (2010-12-16)
Platform: x86_64-pc-linux-gnu (64-bit)
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
 [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
LC_PAPER=en_US.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
[1] stats     graphics  grDevices utils     datasets  grid
methods   base
[1] hexbin_1.26.0      lattice_0.19-33    RColorBrewer_1.0-5
data.table_1.7.8   ggplot2_0.8.9      reshape_0.8.4
[6] plyr_1.6
Post by Matthew Dowle
And you did an 'svn up' (or equivalent)? Grabbing daily tar.gz snapshot
from R-Forge won't include the fix yet. So svn up, then R CMD build, then
R CMD INSTALL, right? (Just checking quick basics first).
Post by Matthew Dowle
Result of test.data.table(), sessionInfo() and confirm it's a clean
install after a reboot to make sure no old .so is still knocking around
somehow please. Definitely installed to the right library? If it's
crashing a lot then it should be reproducible?
Still waiting for CRAN check results for 1.7.7 in old-rel. If it's not
fixed there either that'll help to know....
Latest SVN version, no alloccol set, still crashing a lot.  I don't
use [<- or $<-, the only times I modify a data.table are with :=  or
by doing DT=merge(DT,blah).
Any more info I can provide?
Post by Matthew Dowle
Great fingers and toes crossed. If you could unset alloccol option just
to
be sure please, that would be great. You're our best hope of confirming
it's fixed since it was biting you several times an hour. If you use [<-
or $<- syntax then R will copy via *tmp* and at that point the *tmp*
data.table is similar to a data.table loaded from disk in that it isn't
over-allocated anymore, I realised. Also a copy() will lose
over-allocation until the next column addition.  That 'should' all be
fine
now in both <=2.13.2 and >=2.14.0, although the bug was something
simpler.
1.7.7 is on CRAN now and been built for windows so if CRAN check results
tick over from "ERROR" to "OK" later today (for both windows and mac
old-rel), and, you're ok too, then it's fixed.
Post by Chris Neff
I've updated to the latest SVN version, and I'll be sure to let you
know if it still crashes (however I do have the alloccol option set to
1000, so I shouldn't be bumping into reallocation very often). Thanks
for finding the bug so fast!
Chris Neff
2011-12-16 15:20:20 UTC
Permalink
Just posting things as I find them. I run my script (and it makes it
through with no complaints), but then I try to modify the table
slightly further, like:

DT[, w := x*y]

where x,y are both integer columns of DT (and w doesn't previously
exist), and I get the following:

Error in match(as.vector(x), y, 0L) :
'translateCharUTF8' must be called on a CHARSXP

If I then try to print DT, I get the same error I reported earlier:

Error in do.call("cbind", lapply(x, format, justify = justify, ...)) :
'getCharCE' must be called on a CHARSXP


The problem is I can't get this to reproduce on simpler code. So I
just have to tell you what I see when I see it.
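
For anyone trying to reproduce this on simpler code, a minimal sketch of the failing step might look like the following. The table size and the random integer columns are assumptions (synthetic stand-ins, not the real data); only DT, x, y and w come from the message above. It is not known to trigger the error.

```r
# Sketch only: synthetic stand-in for the real table (size and values are assumed).
if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)
  set.seed(1)
  n  <- 1e5L                                   # assumed size; the real table was far larger
  DT <- data.table(x = sample.int(100L, n, replace = TRUE),
                   y = sample.int(100L, n, replace = TRUE))
  DT[, w := x * y]                             # add a column that did not exist before
  stopifnot(identical(DT$w, DT$x * DT$y))      # the new column should be intact
  print(DT)                                    # printing was where the error showed up
  print(sessionInfo())                         # worth attaching to any report
}
```

If this runs cleanly, increasing n or adding more columns before the := step would be the obvious next things to vary.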
Chris Neff
2011-12-16 15:36:27 UTC
Permalink
Sorry, but this is crashing too often for me to keep trying with it. I
can't get a really reproducible example, so I'll just explain the
sorts of circumstances that seem to make it happen:

1) It always seems to need a really large dataset. For instance,
mine was about 2.5 million rows and 20 columns.
2) My dataset has a factor that is unique to every row as a key, so a
factor with 2.5 million levels (don't know if that matters but
throwing it out there).
3) Crashes seem to happen most when trying to make a new column, and
also bizarrely when trying to use ggplot. A lot of crashes happen
when I try to plot subsets of the data with ggplot.

As one more piece of data, I tried to take a subset of my data.table
and run str() on that subset, like so:

d <- DT[x<10]

str(d)

and got this error:

*** caught segfault ***
address (nil), cause 'unknown'

Traceback:
1: encodeString(lev.att, na.encode = FALSE, quote = "\"")
2: str.default(object[[i]], nest.lev = nest.lev + 1, indent.str =
paste(indent.str, ".."), nchar.max = nchar.max, max.level =
max.level, vec.len = vec.len, digits.d = digits.d, give.attr =
give.attr, give.head = give.head, give.length = give.length, width
= width, envir = envir, list.len = list.len)
3: str(object[[i]], nest.lev = nest.lev + 1, indent.str =
paste(indent.str, ".."), nchar.max = nchar.max, max.level =
max.level, vec.len = vec.len, digits.d = digits.d, give.attr =
give.attr, give.head = give.head, give.length = give.length, width
= width, envir = envir, list.len = list.len)
4: str.default(d, give.length = FALSE)
5: NextMethod("str", give.length = FALSE, ...)
6: str.data.frame(d)
7: str(d)


Once again it is hard to reproduce though.

At this point I have to get some real work done, so I'm reverting back
to 1.7.1 until someone comes up with a new fix or something else for
me to try.
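
The circumstances listed above could be approximated with synthetic data along these lines. This is only a sketch: the column names id and v1..v18, the value ranges and the default n are assumptions, and n is kept small here so the script runs quickly. It is not known to reproduce the crash.

```r
# Sketch only: a table shaped like the one described (factor key unique per row,
# 20 columns). Names id and v1..v18 are hypothetical; only DT, x and d are from the post.
if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)
  set.seed(42)
  n  <- 1e4L                                   # assumed; raise towards 2.5e6 to approach the reported size
  DT <- data.table(id = factor(sprintf("id%07d", seq_len(n))),  # one level per row
                   x  = sample.int(100L, n, replace = TRUE))
  # pad with 18 numeric columns to reach 20 columns in total
  DT[, (paste0("v", 1:18)) := replicate(18, rnorm(.N), simplify = FALSE)]
  setkey(DT, id)

  d <- DT[x < 10]                              # the subset from the message
  print(nlevels(d$id))                         # the subset carries the factor levels along
  str(d)                                       # the call that segfaulted in the report
}
```

The traceback above ends in encodeString() on the factor levels, so the number of levels in the key column seems like a sensible thing to vary first.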
Matthew Dowle
2011-12-16 15:43:39 UTC
Permalink
Great, thanks. Have seen this quite a bit, see FAQ 4.3. It indicates that
memory corruption happened earlier, possibly at any point. It's not
anything to do with locale or CHARSXP. The next step is to follow all the
steps in section 4.3 of R-exts. Turn on gctorture, --use-gct,
--enable-strict-barrier and, especially, valgrind. The goal is to detect
where the earlier corruption is happening.
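
As a rough illustration of the gctorture step, something like the following could be wrapped around whatever operation is suspected. suspect_step() is a hypothetical stand-in, not code from this thread; in practice it would be replaced by the real sequence (for example the := call that crashes).

```r
# Sketch: run a suspect step under GC torture so that a corruption tends to
# surface closer to where it is caused. suspect_step() is a hypothetical placeholder.
suspect_step <- function() {
  x <- as.character(seq_len(50))               # allocate some character data
  paste(x, collapse = ",")
}

gctorture(TRUE)                                # force very frequent garbage collection (slow)
res <- tryCatch(suspect_step(),
                finally = gctorture(FALSE))    # always switch torture off again
nchar(res)

# From a shell, the same script can also be run under valgrind:
#   R -d valgrind --vanilla < script.R
```

Expect this to be many times slower than a normal run, so it is usually worth cutting the data down first.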

On the tenterhook front, 1.7.7 is now passing CRAN checks for oldrel (both
mac and windows) fully OK, which means the last fix definitely fixed the
problem I found. So that's some progress.

But, since 1.7.7+ doesn't fix it for you, it means either:

i) you've found a new corruption that could happen in 2.14.0+, too.

or,

ii) you've found a new problem in my workaround attempts for
uninitialized truelength in <=2.13.2. That might turn up information
that leads to improvements in 2.14.0+ as well.

So either way it's worth following this trail, if you're ok to do so. Fast
techniques to debug the corruptions (e.g. valgrind) might come in handy in
future anyway.

Only other thought ... your special internal build of R ... does it
increase R_len_t on 64bit to allow vectors longer than 2^31, by any
chance? I've used R_len_t quite a bit in data.table to future proof for
when that happens, but if you've done it already in your build then that
would help to know, since afaik it's never been tested when R_len_t != int
on 64bit. I'm also assuming R_len_t is signed. If your R has R_len_t as
unsigned, I would need to know.
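
One way to check this from inside R, assuming the build ships its C headers in the usual place, is to look at how Rinternals.h defines R_len_t. This is only a sketch; a stock build would normally show a plain int typedef there, and an internal build may keep its headers elsewhere.

```r
# Sketch: look up the R_len_t typedef in the headers of the running R build.
hdr <- file.path(R.home("include"), "Rinternals.h")
if (file.exists(hdr)) {
  lines <- readLines(hdr, warn = FALSE)
  print(grep("typedef.*R_len_t", lines, value = TRUE))
} else {
  message("Rinternals.h not found under ", R.home("include"))
}

.Machine$sizeof.pointer                        # pointer size in bytes (8 on a 64-bit build)
R.version$arch                                 # architecture the build reports
```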
Post by Chris Neff
On the current latest SVN build, with debugging enabled as listed
below, I get the following when trying to even print the contents of a
  'getCharCE' must be called on a CHARSXP
Never saw this error without debugging. I tried printing a few times
in a row, got this same error, and then like the 4th time it
segfaulted.
Having a hard time reproducing that, but at least it is something?
Post by Matthew Dowle
One thought ... how about turning on debugging. That way when it crashes
at least you can report the file and line number. Btw, I've installed
2.12.0 on 64bit in case that managed to reproduce, but it still works
for me ok as does 32bit 2.12.0, and both 32 and 64bit 2.14.0. So we're
left with you debugging at your end, but should be fairly easy ...
sudo MAKEFLAGS='CFLAGS=-O0\ -g\ -Wall\ -pedantic' R CMD INSTALL
data.table_1.7.7.tar.gz
R -d gdb
run
Do the stuff that crashes it.  Does it report a C file and line number?
Just to rule out possible svn / R CMD build strangeness, please also use
the data.table_1.7.7.tar.gz that's on CRAN.  It still hasn't run checks
for 1.7.7 so on tenterhooks for that.
Just to come back, it still crashes at seemingly random times.   I'm
reverting back to an earlier version (1.7.1) to see if that fixes my
problem.
Internal build of R. Can't upgrade until they do.  I think it is
unlikely to see 2.14 any time soon.
On 15 December 2011 10:50, Steve Lianoglou
Post by Steve Lianoglou
Hi,
Out of curiosity, is it impossible for you to upgrade R to the
latest, or?
Post by Steve Lianoglou
-steve
Post by Chris Neff
I always use svn up. I'll reboot and reinstall just to make sure.
As
Post by Steve Lianoglou
Post by Chris Neff
for reproducible, it still doesn't seem to crash in any consistent
place but I'll give it a stronger try with a test data set.
All 480 tests in test.data.table() completed ok in 7.395sec
R version 2.12.1 (2010-12-16)
Platform: x86_64-pc-linux-gnu (64-bit)
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
 [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
LC_PAPER=en_US.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
[1] stats     graphics  grDevices utils     datasets  grid
methods   base
[1] hexbin_1.26.0      lattice_0.19-33    RColorBrewer_1.0-5
data.table_1.7.8   ggplot2_0.8.9      reshape_0.8.4
[6] plyr_1.6
Post by Matthew Dowle
And you did an 'svn up' (or equivalent)? Grabbing daily tar.gz
snapshot
Post by Steve Lianoglou
Post by Chris Neff
Post by Matthew Dowle
from R-Forge won't include the fix yet. So svn up, then R CMD
build, then
Post by Steve Lianoglou
Post by Chris Neff
Post by Matthew Dowle
R CMD INSTALL, right? (Just checking quick basics first).
Post by Matthew Dowle
Result of test.data.table(), sessionInfo() and confirm it's a
clean
Post by Steve Lianoglou
Post by Chris Neff
Post by Matthew Dowle
Post by Matthew Dowle
install after a reboot to make sure no old .so is still knocking
around
Post by Steve Lianoglou
Post by Chris Neff
Post by Matthew Dowle
Post by Matthew Dowle
somehow please. Definitely installed to the right library? If
it's
Post by Steve Lianoglou
Post by Chris Neff
Post by Matthew Dowle
Post by Matthew Dowle
crashing a lot then it should be reproducible?
Still waiting for CRAN check results for 1.7.7 in old-rel. If
it's not
Post by Steve Lianoglou
Post by Chris Neff
Post by Matthew Dowle
Post by Matthew Dowle
fixed there either that'll help to know....
Latest SVN version, no alloccol set, still crashing a lot.  I
don't
Post by Steve Lianoglou
Post by Chris Neff
Post by Matthew Dowle
Post by Matthew Dowle
use [<- or $<-, the only times I modify a data.table are with :=
 or
Post by Steve Lianoglou
Post by Chris Neff
Post by Matthew Dowle
Post by Matthew Dowle
by doing DT=merge(DT,blah).
Any more info I can provide?
On 15 December 2011 08:32, Matthew Dowle
Post by Matthew Dowle
Great fingers and toes crossed. If you could unset alloccol
option just
Post by Steve Lianoglou
Post by Chris Neff
Post by Matthew Dowle
Post by Matthew Dowle
Post by Matthew Dowle
to
be sure please, that would be great. You're our best hope of
confirming
Post by Steve Lianoglou
Post by Chris Neff
Post by Matthew Dowle
Post by Matthew Dowle
Post by Matthew Dowle
it's fixed since it was biting you several times an hour. If
you use
Post by Steve Lianoglou
Post by Chris Neff
Post by Matthew Dowle
Post by Matthew Dowle
Post by Matthew Dowle
[<-
or $<- syntax then R will copy via *tmp* and at that point the
*tmp*
Post by Steve Lianoglou
Post by Chris Neff
Post by Matthew Dowle
Post by Matthew Dowle
Post by Matthew Dowle
data.table is similar to a data.table loaded from disk in that
it isn't
Post by Steve Lianoglou
Post by Chris Neff
Post by Matthew Dowle
Post by Matthew Dowle
Post by Matthew Dowle
over-allocated anymore, I realised. Also a copy() will lose
over-allocation until the next column addition.  That 'should'
all be
Post by Steve Lianoglou
Post by Chris Neff
Post by Matthew Dowle
Post by Matthew Dowle
Post by Matthew Dowle
fine
now in both <=2.13.2 and >=2.14.0, although the bug was
something
Post by Steve Lianoglou
Post by Chris Neff
Post by Matthew Dowle
Post by Matthew Dowle
Post by Matthew Dowle
simpler.
1.7.7 is on CRAN now and been built for windows so if CRAN
check
Post by Steve Lianoglou
Post by Chris Neff
Post by Matthew Dowle
Post by Matthew Dowle
Post by Matthew Dowle
results
tick over from "ERROR" to "OK" later today (for both windows
and mac
Post by Steve Lianoglou
Post by Chris Neff
Post by Matthew Dowle
Post by Matthew Dowle
Post by Matthew Dowle
old-rel), and, you're ok too, then it's fixed.
Post by Chris Neff
I've updated to the latest SVN version, and I'll be sure to let you
know if it still crashes (however I do have the alloccol option set to
1000, so I shouldn't be bumping into reallocation very often). Thanks
for finding the bug so fast!
On 14 December 2011 19:56, Matthew Dowle
Post by Matthew Dowle
Hm. Sounds like it could be a different problem then if it was in R
2.14. There have been quite a few fixes since 1.7.4 so if you can
reproduce with 1.7.7 would be great. Or, we've sometimes seen that just
after a package upgrade that a clean re-install can often fix things.
Perhaps if the .so was in use by another R process or a zombie, or
something. R seems to report data.table v1.7.4 (say) but it hasn't fully
installed it properly and is still (perhaps partially) at 1.7.3. So quit
all R (reboot to clear zombies too perhaps) and try reinstalling using
R CMD INSTALL. Next time it happens I mean. Can also run
test.data.table() to check the install.
Post by Timothée Carayol
Hi --
I have been having many unreproducible bugs with R 2.14, data.table
1.7.4 and ubuntu 64 bits about 10 days ago. Data was getting
corrupted, and then R crashed. I had to go back to data.frame for the
bits of code affected. I was doing a lot of rather unsafe
manipulations with row names, rbind and cbinds.
I didn't file a report, nor signal it, as it was occurring seemingly
at random, and I was doing operations which aren't really what
data.table was made for (tons of little manipulations on small data);
still I guess I should now signal that 2.14 didn't fix everything for
me. I do not know whether bugs subsist on post-1.7.4 versions.
t
On Wed, Dec 14, 2011 at 5:31 PM, Matthew Dowle
Post by Matthew Dowle
Maybe, worth a try. Are you loading any data.table objects from disk?
Post by Chris Neff
64 bit 2.12.1 linux.
Is there an option I can set in my session in order to work around the
truelength issue? I don't care if I lose some of the over-allocation
niceties if it stops things from crashing. Looking at the truelength
options(datatable.alloc=quote(1000))
stop this? I never have more than about 50 columns at a time.
On 14 December 2011 11:43, Matthew Dowle
Post by Matthew Dowle
You're R < 2.14.0, right? I'm really struggling in R < 2.14.0 to make
over-allocation work because R only started to initialize truelength to
0 in R 2.14.0+. Before that it's uninitialized (random). Trouble is my
attempts in R < 2.14.0 to work around that work fine for me in linux
32bit when I test in R 2.13.2, and I even test in 2.12.0 too. I test on
64bit too but just 2.14.0. CRAN is also showing errors on 2.13.2
(old-rel) for both mac and windows.
So, this is a pre-2.14.0 (only) problem that I'll continue to try and
fix.
Are you 64bit pre-2.14.0? Which OS? If you are 64bit linux then it adds
weight to me installing pre-2.14.0 on my 64bit instance in an effort to
reproduce.
Post by Chris Neff
This will be a crappy help request because I can't seem to reproduce
it, but the past few days I've been getting a lot of segfaults. The
only common thing between every crash is that it happens when I do
DT[, z := x]
where z was not a column that existed in DT before, and x is either an
existing column of DT or a separate variable, doesn't matter. Beyond
that I can't reproduce a set of steps that gets R to crash. This is
with the latest SVN version.
Is there more information I can provide to help track this down?
_______________________________________________
datatable-help mailing list
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
Chris Neff
2011-12-16 15:48:00 UTC
Permalink
Post by Matthew Dowle
Great, thanks. Have seen this quite a bit, see FAQ 4.3. It indicates an
earlier memory corruption happened, could have been at any point. It's not
anything to do with locale or CHARSXP. The next step is to follow all the
steps in section 4.3 of R-exts. Turn on gctorture, --use-gct,
--enable-strict-barrier, and, valgrind especially. The goal is to detect
where the earlier corruption is happening.
On the tenterhook front, 1.7.7 is now passing CRAN checks for oldrel (both
mac and windows) fully OK so that means the last fix definitely fixed the
problem I found, so that's some progress.
 i) you've found a new corruption that could happen in 2.14.0+, too.
or,
 ii) you've found a new problem in my workaround attempts for
uninitialized truelength in <=2.13.2. That might lead to unexpected
information that could lead to improvements in 2.14.0+ in unexpected
ways.
So either way it's worth following this trail, if you're ok to do so. Fast
techniques to debug the corruptions (e.g. valgrind) might come in handy in
future anyway.
Okay, maybe later today (or Monday) I will try this.
Post by Matthew Dowle
Only other thought ... your special internal build of R ... does it
increase R_len_t on 64bit to allow longer vectors than 2^31, by any
chance?  I've used R_len_t quite a bit in data.table to future proof for
when that happens, but if you've done it already in your build then that
would help to know since it's never been tested afaik when R_len_t != int
on 64bit.  I'm also assuming R_len_t is signed. If your R has R_len_t as
unsigned would need to know.
Asked the guy who would know. Is there any way I can find out through R?
Post by Matthew Dowle
Post by Chris Neff
On the current latest SVN build, with debugging enabled as listed
below, I get the following when trying to even print the contents of a
  'getCharCE' must be called on a CHARSXP
Never saw this error without debugging.  I tried printing a few times
in a row, got this same error, and then like the 4th time it
segfaulted.
Having a hard time reproducing that, but at least it is something?
Post by Matthew Dowle
One thought ... how about turning on debugging. That way when it crashes
at least you can report the file and line number. Btw, I've installed
2.12.0 on 64bit in case that managed to reproduce, but it still works
for me ok as does 32bit 2.12.0, and both 32 and 64bit 2.14.0. So we're
left with you debugging at your end, but should be fairly easy ...
sudo MAKEFLAGS='CFLAGS=-O0\ -g\ -Wall\ -pedantic' R CMD INSTALL data.table_1.7.7.tar.gz
R -d gdb
run
Do the stuff that crashes it.  Does it report a C file and line number?
Just to rule out possible svn / R CMD build strangeness, please also use
the data.table_1.7.7.tar.gz that's on CRAN.  It still hasn't run checks
for 1.7.7 so on tenterhooks for that.
Just to come back, it still crashes at seemingly random times.   I'm
reverting back to an earlier version (1.7.1) to see if that fixes my
problem.
Internal build of R. Can't upgrade until they do.  I think it is
unlikely to see 2.14 any time soon.
On 15 December 2011 10:50, Steve Lianoglou
Post by Steve Lianoglou
Hi,
Out of curiosity, is it impossible for you to upgrade R to the latest,
or?
-steve
Post by Chris Neff
I always use svn up. I'll reboot and reinstall just to make sure. As
for reproducible, it still doesn't seem to crash in any consistent
place but I'll give it a stronger try with a test data set.
All 480 tests in test.data.table() completed ok in 7.395sec
R version 2.12.1 (2010-12-16)
Platform: x86_64-pc-linux-gnu (64-bit)
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
 [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
LC_PAPER=en_US.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
[1] stats     graphics  grDevices utils     datasets  grid
methods   base
[1] hexbin_1.26.0      lattice_0.19-33    RColorBrewer_1.0-5
data.table_1.7.8   ggplot2_0.8.9      reshape_0.8.4
[6] plyr_1.6
Post by Matthew Dowle
And you did an 'svn up' (or equivalent)? Grabbing daily tar.gz snapshot
from R-Forge won't include the fix yet. So svn up, then R CMD build,
then R CMD INSTALL, right? (Just checking quick basics first).
Post by Matthew Dowle
Result of test.data.table(), sessionInfo() and confirm it's a clean
install after a reboot to make sure no old .so is still knocking around
somehow please. Definitely installed to the right library? If it's
crashing a lot then it should be reproducible?
Still waiting for CRAN check results for 1.7.7 in old-rel. If it's not
fixed there either that'll help to know....
Latest SVN version, no alloccol set, still crashing a lot. I don't
use [<- or $<-, the only times I modify a data.table are with := or
by doing DT=merge(DT,blah).
Any more info I can provide?
Chris Neff
2011-12-16 17:37:46 UTC
Permalink
Post by Matthew Dowle
Only other thought ... your special internal build of R ... does it
increase R_len_t on 64bit to allow longer vectors than 2^31, by any
chance?  I've used R_len_t quite a bit in data.table to future proof for
when that happens, but if you've done it already in your build then that
would help to know since it's never been tested afaik when R_len_t != int
on 64bit.  I'm also assuming R_len_t is signed. If your R has R_len_t as
unsigned would need to know.
Answer to this is no, we haven't touched that.

I'm happy to keep helping, but if you'd rather not worry about that
stuff, we will be upgrading to 2.14 in the next few months apparently,
and I can live with 1.7.1 until then.
Matthew Dowle
2011-12-17 00:27:52 UTC
Permalink
It'd be good to get to the bottom of it in case it's not a pre-2.14.0
problem. Try this :

apt-get install valgrind (if not already installed)
R -d valgrind
require(data.table)
test.data.table()

When I do this it runs very slowly but eventually completes ok with just
test 120 failing. Test 120 is a timing test, which takes longer because
of valgrind mode, so that's ok. Ignore the valgrind messages for R
itself that happen before R's banner comes up.

If you get the same, then proceed to run your tests that crash it.
Hopefully you'll get some messages at the point the corruption occurs.
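
For anyone following along, a minimal sketch of the kind of stress loop that could be run inside the valgrind session once test.data.table() has completed. It is not taken from the thread: the column names, sizes and loop count are made up, and it is written against current data.table syntax (the `(newcol) := x` form), which differs from what 1.7.x accepted. It only repeats the operation reported to precede the crashes, adding a brand-new column by reference.

```r
library(data.table)

# Hypothetical stress loop: keep adding brand-new columns by reference,
# the operation reported to precede the crashes.
set.seed(1)
DT <- data.table(x = rnorm(1000))

n_new <- 200L
for (i in seq_len(n_new)) {
  newcol <- paste("z", i, sep = "")
  # (newcol) := x adds a column whose name is the string held in newcol.
  DT[, (newcol) := x]
}

# Sanity checks: every column arrived and holds the same values as x.
stopifnot(ncol(DT) == n_new + 1L)
stopifnot(identical(DT[["z200"]], DT[["x"]]))
```

If a corruption is triggered, valgrind would be expected to print its messages while the loop runs rather than at the later point where R finally crashes.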
Matthew Dowle
2011-12-20 00:03:11 UTC
Permalink
Chris,

Are you returning any character or list() columns in j when grouping? If
so, Jim Holtman provided a reproducible example and a fix has just been
committed. It gave the same errors and segfaults, and it affects R >= 2.14.0
too, not just R < 2.14.0. Could this also be the same problem Timothée
Carayol mentioned? Fingers crossed ...

Matthew
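
For readers unsure what "returning character or list() columns in j when grouping" looks like, here is a minimal sketch. It is not Jim Holtman's reproducible example (that is not shown in this thread); the table DT2 and its columns g and v are made up purely for illustration.

```r
library(data.table)

# Hypothetical toy table; the names are illustrative only.
DT2 <- data.table(g = c("a", "a", "b"), v = 1:3)

# j returns a character column (label) and a list column (vals),
# one row per group of g.
res <- DT2[, list(label = paste(v, collapse = "-"), vals = list(v)), by = g]
print(res)
```

If your grouped queries have this shape, the committed fix may be relevant to you.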
Chris Neff
2011-12-20 01:12:10 UTC
Permalink
I definitely do that somewhere in my code. I'll patch tomorrow and try.
Chris Neff
2011-12-20 12:56:23 UTC
Permalink
So far so good. The state before this latest patch was I would run my
script, and then try to mess with the resultant data.table, and almost
immediately it would segfault. 10 minutes of playing and no segfaults
yet. Will update if there is one.
Chris Neff
2011-12-20 13:45:49 UTC
Permalink
Emailed too soon. Crashing again. I'll re-enable debugging and see what
comes up the next time it happens. Still isn't at all consistent as to
when exactly it crashes. I just have a script that makes a data.table
that I know will eventually crash if I use the data.table enough.
Can't reproduce on toy sets.

In regards to the valgrind request, I ran test.data.table with
valgrind on and everything passed. It timed out when trying to run my
script though, and was way way slower than normal in the process.
Chris Neff
2011-12-20 13:53:47 UTC
Permalink
One random error (variable names changed).
DT[,foo:=foo]
Error in `[.data.table`(DT, , `:=`(foo, foo)) :
  SET_VECTOR_ELT() can only be applied to a 'list', not a 'NULL'
where foo was a vector in the global environment that I was trying to
DT
NULL
attr(,"row.names")
  [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36  37
 [38]  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72  73  74
 [75]  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95  96  97  98  99 100 101 102 103 104 105 106 107 108 109 110 111
[112] 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148
[149] 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185
[186] 186 187 188 189 190 191 192 193 194 195 196 197 198
attr(,"class")
[1] "data.table" "data.frame"
attr(,"sorted")
[1] "x" "y"

And once again it is random. If I go through and remake DT in the
exact same session, it works fine.
Matthew Dowle
2011-12-20 14:32:47 UTC
Permalink
Can you provide the output of str(DT) please? Off list or obfuscated, or
both is fine. And the commands you're running on it also. The row.names
look suspect for example. A saved object, even better. As soon as I have
data (or at least detailed structure) and commands then so far it's been
quite quick to find and fix.

Thanks for running valgrind. I've (now) realised that for these types of
bug valgrind probably isn't going to help. I also ran it for the last 2
bugs and it didn't find them.

Other things that might give clues: switch both of these on before running
the script:

gcinfo(TRUE)
options(datatable.verbose=TRUE)

And finally a wild stab in the dark ... are you 'plonking' with :=? i.e.
adding or replacing a column by providing a RHS which is as long as the
number of rows in the table, so it can be 'plonked' straight in? Because
foo:=foo would do that, but do you really mean assigning foo to itself?
Copying a column to a new name is perhaps something I haven't considered
or tested.
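
To make the requests above concrete, here is a rough sketch of the diagnostics and of a full-length := assignment. The table DT, the vector foo and the column x2 are hypothetical stand-ins, not Chris's real data, and this toy case is not expected to reproduce the crash.

```r
library(data.table)

# Diagnostics suggested above: report garbage collections and
# make data.table print what it is doing.
gcinfo(TRUE)
options(datatable.verbose = TRUE)

# Hypothetical stand-in for the real table.
DT <- data.table(x = 1:5, y = letters[1:5])
foo <- 6:10        # vector in the global environment, as long as nrow(DT)

DT[, foo := foo]   # full-length RHS, so it can be 'plonked' in as a new column
DT[, x2 := x]      # copying an existing column to a new name

str(DT)            # the structure output asked for above

# Save the object so it can be sent off list.
f <- tempfile(fileext = ".RData")
save(DT, file = f)
```

Running the real script with these settings on, then sending the str() output and the saved object, gives the detail requested.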
Timothée Carayol
2011-12-20 13:22:52 UTC
Permalink
Hi -- yes, that could well be the case. My bugs were nowhere near as
frequent as Chris Neff's, but maybe the cause was the same. My problems
were confined to one specific one-off application, which was a rather heavy
job in the first place and which I have since rewritten in a
data.frame-esque way -- so it would be rather inconvenient to test this
fix. I haven't had any data.table problems in weeks.

Cheers
Post by Matthew Dowle
Chris,
Are you returning any character or list() columns in j when grouping? If
so, Jim Holtman provided a reproducible example and a fix has just been
committed. Same errors / seg faults, and, for R >= 2.14.0, not just R <
2.14.0. Could this also be the same problem Timothée Carayol mentioned?
Fingers crossed ...
Matthew