
data structure for different subpopulations within the AgentPopulation class #42

Closed
sbenthall opened this issue Nov 11, 2021 · 20 comments

@sbenthall (Owner)

Currently all the agent objects in AgentPopulation are just kept in a list.

That makes several other operations awkward.

This should be handled with a nicer data structure that makes those operations cleaner.

Note that the current way things are done is partly because of how HARK's distribute_params method works. distribute() in this repository is contorted around HARK's distribute_params() method:

https://github.com/sbenthall/HARK_ABM_INTRO_public/blob/master/HARK/hark_portfolio_agents.py#L55

This is the underlying function in HARK that could well be rewritten:

https://github.com/econ-ark/HARK/blob/master/HARK/core.py#L1664

@sbenthall (Owner Author)

Another thing this class could do, which would serve a much more general purpose, is help refactor this sort of realistic definition of a population:

https://github.com/econ-ark/DistributionOfWealthMPC/blob/master/Code/SetupParamsCSTW.py

@sbenthall (Owner Author)

See this related issue. In the current agent population code, I rescaled the parameters with ad hoc code. But this could be done with a more general utility added to HARK.

econ-ark/HARK#995

@sbenthall sbenthall added this to the v0.2 milestone Nov 23, 2021
@alanlujan91 (Collaborator)

alanlujan91 commented Nov 24, 2021

I will discuss design here.

Current HARK agents are heterogeneous with respect to their states (cash-on-hand m, income p, assets a) but homogeneous with respect to their parameters (ex-ante identical: same CRRA, DiscFac, stock-market expectations, etc.).

What we need is an AgentPopulation class that allows for heterogeneity of preferences and/or beliefs (as a start, maybe others in the future).

Generically, this AgentPopulation takes as inputs which parameters are to be heterogeneous and what the distribution of those parameters is. For example, CRRA -> [bot, top, n] results in a uniform distribution of agents with respect to their CRRA preferences. Other distributions could be desirable, as could different discretizations.

For our purposes, we are thinking of varying [CRRA, DiscFac, RiskyAvg, RiskyStd]. AgentPopulation should create a grid of agents of size (CRRA_n, DiscFac_n, RiskyAvg_n, RiskyStd_n) where AgentPopulation.__sub_agent__[i,j,k,l] = PortfolioConsumerType(CRRA[i], DiscFac[j], RiskyAvg[k], RiskyStd[l]). The sub-agent classes in this case are just parameter holders that describe their contained models, and should not carry agents and simulations themselves. AgentPopulation should instead hold the agents and the simulation, where [CRRA, DiscFac, RiskyAvg, RiskyStd] become states themselves.
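A minimal sketch of the grid construction described above, using plain dicts as stand-ins for PortfolioConsumerType parameterizations (the parameter names follow the comment; the grid values are made up):

```python
from itertools import product

import numpy as np

# Hypothetical discretized parameter values; the names mirror the
# HARK parameters discussed above, but the numbers are illustrative.
param_grids = {
    "CRRA": np.linspace(2.0, 6.0, 3),
    "DiscFac": np.linspace(0.90, 0.99, 3),
    "RiskyAvg": np.linspace(1.05, 1.10, 2),
    "RiskyStd": np.linspace(0.15, 0.25, 2),
}

def make_sub_agents(grids):
    """Build the (CRRA_n, DiscFac_n, RiskyAvg_n, RiskyStd_n) grid of
    parameter holders. Each entry is a plain dict standing in for one
    ex-ante PortfolioConsumerType parameterization."""
    names = list(grids)
    shape = tuple(len(grids[n]) for n in names)
    sub_agents = np.empty(shape, dtype=object)
    for idx in product(*(range(s) for s in shape)):
        sub_agents[idx] = {n: grids[n][i] for n, i in zip(names, idx)}
    return sub_agents

sub_agents = make_sub_agents(param_grids)
# sub_agents[i, j, k, l] holds the parameters for one ex-ante type.
```

The object array keeps the sub-agents addressable by parameter indices, matching the `__sub_agent__[i,j,k,l]` access pattern described above.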

Assuming we have created an AgentPopulationSolution object (which I will discuss below) an agent would transition by calling aNrm = mNrm - solution[t].cFunc(mNrm, CRRA, DiscFac, RiskyAvg, RiskyStd). If we are careful about compartmentalizing the parameters that will actually change during simulation (RiskyAvg, RiskyStd), this could even be reduced to aNrm = mNrm - solution[t].cFunc(mNrm, RiskyAvg, RiskyStd). In essence, RiskyAvg and RiskyStd also become state variables that evolve over the simulation.

Now to AgentPopulationSolution. In AgentPopulation we created a grid of sub-agents. Let's assume that our population is discrete along [CRRA, DiscFac] but continuous along [RiskyAvg, RiskyStd]. The AgentPopulationSolution would traverse the grid and solve every single sub-agent, which gives (CRRA_n * DiscFac_n * RiskyAvg_n * RiskyStd_n) different solution objects. AgentPopulationSolution now has the task of "stitching" all of these solutions together to make a population solution.

Continuing with the example, for every [CRRA, DiscFac] which are discrete in the population, our solution depends on [RiskyAvg, RiskyStd] and mNrm. We already created cFunc(mNrm), so now we create an interpolator such that we can have cFunc(m, avg, std). Going back to what I wrote earlier, once this "stitching" is complete, the way to access the solution could be cNrm = solution[t, CRRA, DiscFac].cFunc(mNrm, RiskyAvg, RiskyStd), where [t, CRRA, DiscFac] are exogenous or deterministic states, and [mNrm, RiskyAvg, RiskyStd] are endogenous and evolving states.
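A sketch of the "stitching" step for one discrete (CRRA, DiscFac) cell, assuming scipy is available; a made-up linear consumption rule stands in for the values the solved cFuncs would return:

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Grids for a single discrete (CRRA, DiscFac) cell; the values are made up.
m_grid = np.linspace(0.1, 20.0, 50)
avg_grid = np.array([1.05, 1.08, 1.10])
std_grid = np.array([0.15, 0.20, 0.25])

# Stand-in for the per-sub-agent solutions: c_values[i, j, k] plays the
# role of solution.cFunc(m_grid[i]) for (avg_grid[j], std_grid[k]).
c_values = np.array([
    [[0.6 * m + 0.1 * a - 0.2 * s for s in std_grid] for a in avg_grid]
    for m in m_grid
])

# "Stitch" the per-parameter solutions into one interpolated consumption
# function over (mNrm, RiskyAvg, RiskyStd).
cFunc = RegularGridInterpolator((m_grid, avg_grid, std_grid), c_values)

cNrm = cFunc([[1.0, 1.08, 0.20]])[0]  # evaluate at an interior point
```

With the stitched interpolator, `solution[t, CRRA, DiscFac].cFunc(mNrm, RiskyAvg, RiskyStd)` reduces to a single multilinear lookup per agent per period.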

@alanlujan91 (Collaborator)

Another important source of ex-ante heterogeneity is income processes for different education classes, which is closer to what cstwMPC does.

@sbenthall (Owner Author)

I think all this is great.

One thing I'll add is that the AgentPopulation should be initialized with configurable Distributions for varying parameters.
The current implementation assumes Uniform distributions.

Also, the parameters determining the shape of the distribution (top and bottom for Uniform, mean and std for Normal, etc.) should be separated from the approximation parameter (the n for number of values to discretize the distribution into).

So the initial parameterization of the AgentPopulation should be:

  • a dictionary of parameters, whose values are either
    • scalars for fixed values
    • a data structure [currently lists, but could be a more specialized data class] for time-varying values
    • a continuous Distribution, fully parameterized
    • A time-varying distribution? See https://github.com/econ-ark/HARK/blob/master/HARK/distribution.py#L33
    • a data structure (maybe a dictionary to start) for a categorically varying parameter (such as by education level). Values for each category could be:
      • a scalar
      • a time-varying value
      • a distribution...
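The parameterization above could start out as a plain dictionary; in this sketch the parameter names and numbers are illustrative, and the ("Uniform", {...}) tuples stand in for fully parameterized Distribution objects:

```python
# A sketch of the initial parameterization, using plain Python containers.
# All names and numbers here are illustrative, not HARK's actual defaults.
population_parameters = {
    "cycles": 0,                                    # scalar, fixed value
    "PermGroFac": [1.01, 1.01, 1.02],               # time-varying list
    "CRRA": ("Uniform", {"bot": 2.0, "top": 6.0}),  # continuous distribution
    "DiscFac": {                                    # categorically varying
        "college": ("Uniform", {"bot": 0.95, "top": 0.99}),
        "no_college": 0.92,                         # scalar within a category
    },
}

# Approximation parameters are kept separate from the distributions'
# shape parameters, so the same "true" population can be discretized
# at different sizes.
approximation = {"CRRA": {"n": 5}, "DiscFac": {"n": 3}}
```

Keeping `approximation` separate is what lets an AbstractAgentPopulation generate different discretized AgentPopulations from one underlying description.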

It would actually make sense for there to be an AbstractAgentPopulation (or TrueAgentPopulation, or something) that takes only these parameters, which then generates a discretized or approximate AgentPopulation when given:

  • n for all its continuous distributions
  • grid parameters for continuous state variables.

@nicksawhney could get started on the first part of this.

@sbenthall (Owner Author)

For context, this issue in HARK represents a design ideal that Chris feels strongly about. As long as we are writing new code/designs, it makes sense to model this new design.

econ-ark/HARK#914

@sbenthall sbenthall self-assigned this Dec 7, 2021
@sbenthall (Owner Author)

@llorracc has some draft work on functionality like this in his "2.0 pre-ALPHA" HARK PR:

https://github.com/econ-ark/HARK/blob/3ba91db642bd0394ef93b414cbd27f98fcdaf56f/HARK/ConsumptionSaving/ConsIndShockModel_AgentTypes.py

Note especially prmtv_par and aprox_lim as separate namespaces within the parameters of AgentTypePlus.

@sbenthall (Owner Author)

Note the very interesting part of the @llorracc implementation that uses progressively granular approximations to accelerate discovery of the solution.

@llorracc

llorracc commented Dec 10, 2021

I feel strongly that we need to refine our technology for defining models; I think a model is not well defined without some unambiguous specification of what idealized object the approximations are approximating. Seb's idea of taking the number of approximating points as an input seems like a sensible one.

I'd argue, though, for a somewhat more flexible approach than the structure Seb describes. In particular, I think that we should separate the machinery for describing the distribution from the machinery for organizing the information that the machinery needs.

That is, at each point where a distribution needs to be generated, the code's endpoint should be a call to some user-defined function (e.g., make_parameter_distribution(parameter_name,distribution_description,time_description)) and the distribution_description would contain the info needed to construct the approximation.

Like, distribution_description might contain:

  1. The name of a class that describes the distribution (Say, DiscreteApproxToMeanOneLogNormalTruncated)
  2. The actual inputs that the class needs (variance, number of approximating points, method of approximation)
  3. The limiting characteristics if computational resources were infinite
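A sketch of what this might look like: make_parameter_distribution and the class name come from the comment above, while the dict layout, the equiprobable-midpoint discretization, and all numbers are assumptions.

```python
from statistics import NormalDist

import numpy as np

# Illustrative distribution_description following the three items above.
distribution_description = {
    "class": "DiscreteApproxToMeanOneLogNormalTruncated",  # item 1
    "inputs": {"variance": 0.04, "n_points": 7,            # item 2
               "method": "equiprobable"},
    "limit": {"n_points": "infinite"},                     # item 3
}

def make_parameter_distribution(parameter_name, distribution_description,
                                time_description=None):
    """Hypothetical endpoint: dispatch on the named distribution class
    and return (values, weights) for a discrete approximation."""
    cls = distribution_description["class"]
    inputs = distribution_description["inputs"]
    if cls == "DiscreteApproxToMeanOneLogNormalTruncated":
        n = inputs["n_points"]
        sigma = inputs["variance"] ** 0.5
        # Midpoints of n equiprobable quantile bins of a lognormal whose
        # underlying normal has mean -sigma^2/2 (approximately mean one).
        probs = (np.arange(n) + 0.5) / n
        draws = np.exp(
            np.array([NormalDist().inv_cdf(p) for p in probs]) * sigma
            - sigma ** 2 / 2
        )
        return draws, np.full(n, 1.0 / n)
    raise ValueError(f"unknown distribution class: {cls}")
```

The point of the dispatch is that the caller only organizes information; the discretization machinery itself lives behind the named class.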

The upshot is that the first priority should be to improve and standardize our tools for describing any particular distribution. Only when that is done will we know what inputs we generically need to keep track of for the larger description.

PS. Another logically prior step is to settle any outstanding questions about how we want to keep track of time/date/epoch/age/subperiod.

@sbenthall (Owner Author)

Hi @llorracc. I'm not sure I entirely follow what you're saying. What do you mean by time_description?

@sbenthall (Owner Author)

Also, @llorracc I think that because of the timeline for development around SHARKFin, this repository is going to need to err on the side of imperfect but functioning implementations, as opposed to building off of perfect "HARK 2.0" implementations.

I know that for HARK 2.0 you want a lot of generality in problem representation which isn't in the current (pre-1.0) version of HARK. I think we can make a lot of progress building towards 1.0 without taking on the full 2.0 scope.

@sbenthall (Owner Author)

I confused myself about this, but another point to clarify here is that the distributions in the current use case are specifically over the population of agents (i.e., the agent count with each CRRA level) as opposed to being probability distributions for exogenous shocks.

@sbenthall (Owner Author)

Summary of meeting about this with @llorracc:

  • We agree that model parameters should be given a class that describes the distribution and its arguments/parameters.
  • Ideally, discretization methods are decoupled from the distribution classes. This requires a change to HARK: discretization methods as a general type of object/function that takes continuous Distributions (econ-ark/HARK#1091).
  • SHARKFin won't block on HARK developments, but we should both build out towards 'ideal' designs.
  • Currently, HARK internals use the Python primitives __getitem__ and __iter__ for 'time-varying' parameters. This has led to the implementation of an IndexDistribution in HARK for representing a time-varying distribution: https://github.com/econ-ark/HARK/blob/master/HARK/distribution.py#L33. This is not the ideal way to represent time-varying parameters, because of its ambiguities across finite, infinite, and seasonal problems. Again, this is a case where core HARK improvements are required for 'ideal' SHARKFin behavior, but SHARKFin can begin by supporting a limited set of agents/problems.

@sbenthall (Owner Author)

Python has a system for creating data types that are not as heavyweight as classes:
https://docs.python.org/3/library/typing.html

These could be used for 'time-varying' parameters. There may be many ways to improve HARK with this set of language features.
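A sketch of how typing constructs could label time-varying parameters; the type names and the ParameterFrame container are illustrative, not HARK's:

```python
from typing import Dict, List, NamedTuple, Union

# Lightweight type aliases for parameter values (illustrative names).
Scalar = Union[int, float]
TimeVarying = List[Scalar]  # one value per period
Parameter = Union[Scalar, TimeVarying]

class ParameterFrame(NamedTuple):
    """A parameter set with the time-varying entries labeled in the type
    rather than by HARK's current time_vary/time_inv attribute lists."""
    fixed: Dict[str, Scalar]
    time_varying: Dict[str, TimeVarying]

params = ParameterFrame(
    fixed={"CRRA": 3.0},
    time_varying={"PermGroFac": [1.01, 1.01, 1.02]},
)
```

A static checker (e.g. mypy) can then flag, at lint time, code that passes a scalar where a per-period list is expected.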

@sbenthall (Owner Author)

Earlier, I put a design document for this new class here:
https://github.com/sbenthall/SHARKFin/blob/master/design/AgentPopulationDesignDocument.ipynb

Feel free to use that notebook for further work designing this AgentPopulation class.

@alanlujan91 (Collaborator)

I've been looking at this and I'm sketching out an idea.

Going back to your typing suggestions, though: type aliases and new types seem to be intended for static linting, but can't type-check at run time, right? Is there something else on that page that I should be looking at?

@sbenthall (Owner Author)

This is a good guide to types in Python: https://realpython.com/python-type-checking/

Yes, Python is still dynamically typed even with type hints.
The static check imposes clarity on the architecture.
It is possible to use explicit type checks in the software itself if it's functionally important.
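For instance, a hypothetical setter that enforces its hint explicitly at run time (the function name and bounds are illustrative):

```python
def set_discount_factor(disc_fac: float) -> float:
    """Explicit runtime check backing up the static type hint, for a
    case where passing the wrong type would silently corrupt results."""
    if not isinstance(disc_fac, float) or not (0.0 < disc_fac < 1.0):
        raise TypeError("DiscFac must be a float strictly between 0 and 1")
    return disc_fac
```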

@nicksawhney (Collaborator)

What's the status of this issue? We created the AgentList object as a temporary data structure to handle groups of different agents in the future, but it seems like much of the discussion has moved to @alanlujan91's new agent population code. Should we close this issue?

@sbenthall (Owner Author)

@nicksawhney It's true that issue #52 is proposed as a solution to this, and #52 is still in progress.

I prefer to leave issues open as TODO items until they are settled by a final PR; PRs are options for settling issues. This is to some extent just a matter of convention/style.

@sbenthall (Owner Author)

Closed with #52
