squiggle.c

Self-contained Monte Carlo estimation in C99

ROADMAP.md (4787B)


# Roadmap

## To do

- [x] Big refactor
  - [ ] Come up with a better headline example; the Fermi paradox paper is too complicated
  - [x] Make README.md less messy
  - [x] Give examples of new functions
  - [x] Reference commit with cdf functions, even though deleted
- [ ] Post on suckless subreddit
- [ ] Look into <https://lite.duckduckgo.com/html/> instead?
- [ ] Add a few more real-life applications
  - [ ] US election modelling?
- [ ] Look into using `size_t` instead of `int` for sample counts
- [ ] Reorganize code a little bit to reduce usage of gcc's nested functions
- [ ] Rename examples

## Done

- [x] Document print stats
- [x] Document rudimentary algebra manipulations for normal/lognormal
- [x] Think through whether to delete cdf => samples function => not for now
- [x] Think through whether to:
  - simplify and just abort on error
  - complexify and use boxes for everything
  - leave as is
  - [x] Offer both options
- [x] Add more functions to do algebra and get the 90% c.i. of normals, lognormals, betas, etc.
  - Think through which of these make sense.
- [x] Systematize references
- [x] Think through seed initialization
- [x] Document parallelism
- [x] Document confidence intervals
- [x] Add example for only one sample
- [x] Add example for many samples
- [x] Use gcc extension to define functions nested inside main.
- [x] Chain various `sample_mixture` functions
- [x] Add beta distribution
  - See <https://stats.stackexchange.com/questions/502146/how-does-numpy-generate-samples-from-a-beta-distribution> for a faster method.
- [x] Use OpenMP for acceleration
- [x] Add function to get sample when given a cdf
- [x] Don't have a single header file.
- [x] Structure project a bit better
- [x] Simplify `PROCESS_ERROR` macro
- [x] Add README
  - [x] Schema: a function which takes a sample and manipulates it,
  - [x] and at the end, an array of samples.
  - [x] Explain boxes
  - [x] Explain nested functions
  - [x] Explain exit on error
  - [x] Explain individual examples
- [x] Rename functions to something more self-explanatory, e.g., `sample_unit_normal`.
- [x] Add summarization functions: mean, std
- [x] Add sampling from a gamma distribution
  - https://dl.acm.org/doi/pdf/10.1145/358407.358414
- [x] Explain correlated samples
- [x] Test summary statistics for each of the distributions.
  - [x] For uniform
  - [x] For normal
  - [x] For lognormal
  - [x] For lognormal (`to` syntax)
  - [x] For beta distribution
- [x] Clarify gamma/standard gamma
- [x] Add efficient sampling from a beta distribution
  - https://dl.acm.org/doi/10.1145/358407.358414
  - https://link.springer.com/article/10.1007/bf02293108
  - https://stats.stackexchange.com/questions/502146/how-does-numpy-generate-samples-from-a-beta-distribution
  - https://github.com/numpy/numpy/blob/5cae51e794d69dd553104099305e9f92db237c53/numpy/random/src/distributions/distributions.c
- [x] Pontificate about lognormal tests
- [x] Give warning about sampling-based methods.
- [x] Have some more complicated & realistic example
- [x] Add summarization functions: 90% ci (or all c.i.?)
- [x] Link to the examples in the examples section.
- [x] Add a few functions for doing simple algebra on normals and lognormals
  - [x] Add prototypes
  - [x] Use named structs
  - [x] Add to header file
  - [x] Provide example algebra
  - [x] Add conversion between 90% ci and parameters.
  - [x] Use that conversion in conjunction with small algebra.
  - [x] Consider ergonomics of using ci instead of c_i
    - [x] use named struct instead
    - [x] demonstrate and document feeding a struct directly to a function: `my_function((struct c_i){.low = 1, .high = 2});`
  - [x] Move to own file? Or signpost in file? => signposted in file.
- [x] Write twitter thread: now [here](https://twitter.com/NunoSempere/status/1707041153210564959); retweets appreciated.
- [x] Write better confidence interval code that:
  - Gets number of samples as an input
  - Gets either a sampler function or a list of samples
  - Is O(n), not O(n log(n))
  - Parallelizes stuff
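
To illustrate the O(n) point in the last item: a 90% confidence interval only needs two order statistics, so a selection algorithm can stand in for a full sort. What follows is a minimal sketch, not the code in this repository. The names `ci_90_quickselect` and `quickselect` are hypothetical, the `struct c_i` layout is assumed from the named-struct item above, and the percentile indices use a simple floor rule chosen for illustration. It covers only the "list of samples" case, does no parallelization, and is O(n) on average rather than in the worst case.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout, mirroring the `struct c_i` mentioned in the list above. */
struct c_i {
    double low;
    double high;
};

static void swap_doubles(double* a, double* b)
{
    double tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Hypothetical helper: returns the k-th smallest element (0-indexed) of
 * xs[lo..hi], partially reordering xs in place. Requires lo <= k <= hi.
 * Average O(n); repeated bad pivots can degrade it to O(n^2). */
static double quickselect(double* xs, size_t lo, size_t hi, size_t k)
{
    while (lo < hi) {
        /* Lomuto partition around the middle element. */
        size_t mid = lo + (hi - lo) / 2;
        swap_doubles(&xs[mid], &xs[hi]);
        double pivot = xs[hi];
        size_t store = lo;
        for (size_t i = lo; i < hi; i++) {
            if (xs[i] < pivot) {
                swap_doubles(&xs[i], &xs[store]);
                store++;
            }
        }
        swap_doubles(&xs[store], &xs[hi]);

        if (k == store) {
            return xs[k];
        } else if (k < store) {
            hi = store - 1; /* safe: store > k >= lo */
        } else {
            lo = store + 1;
        }
    }
    return xs[lo];
}

/* Hypothetical function: 5th and 95th percentile of n samples.
 * Modifies the order of xs. Index rule (floor) is an assumption. */
struct c_i ci_90_quickselect(double* xs, size_t n)
{
    assert(n > 0);
    size_t low_k = (n - 1) * 5 / 100;
    size_t high_k = (n - 1) * 95 / 100;
    struct c_i result;
    result.low = quickselect(xs, 0, n - 1, low_k);
    result.high = quickselect(xs, 0, n - 1, high_k);
    return result;
}
```

Because the selection reorders the array in place, a caller that still needs the samples in their original order would have to pass a copy.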

## Discarded

- [ ] ~~Disambiguate `sample_laplace` (successes vs. failures, or successes vs. total trials) as two distinct and differently named functions~~
- [ ] ~~Support all distribution functions in <https://www.squiggle-language.com/docs/Api/Dist>~~
- [ ] ~~Add a custom preprocessor to allow simple nested functions that don't rely on local scope?~~
- [ ] ~~Add tests in Stan?~~
- [ ] ~~Test results for lognormal manipulations~~
- [ ] ~~Consider desirability of defining shortcuts for algebra functions. Adds a level of magic, though.~~
- [ ] ~~Think about whether to write a simple version of this for [uxn](https://100r.co/site/uxn.html), a minimalist portable programming stack which, sadly, doesn't have doubles (64-bit floats)~~