One huge downside of #Haskell is its often insane memory footprint. A Data.Map with a mere million entries can easily occupy a gigabyte or more. I'm not sure what to do about this other than start rewriting critical sections of the code in either C++ or Rust.
@pureevil It's due to the laziness, I suppose. But AFAIK there are tricks to make the computations more "eager".
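Something like this, roughly — a minimal sketch assuming you're accumulating into a Data.Map (the word-counting example is mine, just for illustration): use the strict Map variant plus a strict fold, so no thunks pile up inside the container.

```haskell
{-# LANGUAGE BangPatterns #-}
module Eager where

import qualified Data.Map.Strict as M
import Data.List (foldl')

-- foldl' forces the accumulated Map on every step, and Data.Map.Strict's
-- insertWith forces each stored value, so nothing is left behind as a thunk.
countWords :: [String] -> M.Map String Int
countWords = foldl' step M.empty
  where
    step !acc w = M.insertWith (+) w 1 acc
```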
@amiloradovsky not sure that's the case: I used deepseq, and heap profiling showed the actual data size to be a lot smaller. The total memory footprint of the program, however, was that abysmal number I mentioned in my first post.
This looks to me like a case of memory fragmentation, but I won't know for sure until I dig further.
@pureevil @amiloradovsky According to https://wiki.haskell.org/GHC/Memory_Footprint, on a 64-bit machine a Map occupies about 48 bytes per element, plus the size of the key and the value, which is quite a lot, yeah :/
What do you mean by memory fragmentation? Isn't GHC's GC supposed to avoid that?
@typochon @amiloradovsky 48 bytes per entry still comes out to only about 48 megabytes per million entries, which is a lot less than what I've observed. My case actually involves a different, more specialized container, but Data.Map gives numbers of the same order of magnitude.
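For reference, this is roughly the kind of sanity check I mean (a sketch, not my real code — the actual container is different). Compile with `ghc -O2 -rtsopts` and run with `+RTS -s` to compare the reported maximum residency against the ~48 MB estimate:

```haskell
module Main where

import qualified Data.Map.Strict as M
import Control.DeepSeq (deepseq)

main :: IO ()
main = do
  -- a million Int/Int entries; the wiki estimate would be ~48 MB plus keys and values
  let m = M.fromList [ (i, 2 * i) | i <- [1 .. 1000000 :: Int] ]
  -- force the whole map before the RTS prints its heap statistics
  m `deepseq` print (M.size m)
```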
The GC is supposed to fight fragmentation, yeah, but I'm not entirely sure this isn't its fault. There have been bugs in the RTS before; I've even reported a few myself.
@pureevil @amiloradovsky I suppose it's the price you pay for purity. Look at the mutable dictionaries: they have much better memory properties. https://github.com/haskell-perf/dictionaries
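E.g. the hashtables package from those benchmarks — a rough sketch of what using it looks like (the Int keys and values here are just placeholders):

```haskell
module MutableDict where

import qualified Data.HashTable.IO as H

-- a basic mutable hash table living in IO
type Table = H.BasicHashTable Int Int

buildTable :: IO Table
buildTable = do
  ht <- H.new
  -- insert mutates the table in place, no per-update copying of the structure
  mapM_ (\i -> H.insert ht i (2 * i)) [1 .. 1000000]
  pure ht

main :: IO ()
main = do
  ht <- buildTable
  v  <- H.lookup ht 42
  print v
```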
@4e6 @amiloradovsky yup, implementing a mutable version would be a good idea
@pureevil "unboxing"? (forcing the compiler to use a more "economic" representations)