scaling #3576

Open
4 tasks
kainagel opened this issue Nov 21, 2024 · 14 comments

@kainagel
Member

Should include the following:

  • introduce "scaling" config param in "global" config section.
  • change the "dependent" params (flowCapFactor, storCapFactor, countsScaleFactor, ...) such that they only correct the global parameter. This could be either an additional multiplier or an override. The latter might be easier to refactor, since one could set the default to "null" and then, if these params are not set explicitly, they are set from the global param. On the other hand, this would mean automagic, which in general we do not like.
  • Make sure that outputs are scaled. For example, emissionsByM should be scaled up to 100%.
  • For the "tables" (trips, persons, legs, activities, etc.), we could add "weights" to the individual persons. For example, for a scaling of 0.1 the weights would be uniformly "10". Many surveys have weights for individual records, so use the same approach here.
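
The two options for the dependent params can be contrasted in a minimal sketch. It assumes a global scale of 0.1 and models the "null" default with an empty `OptionalDouble`; the class and method names are hypothetical and not part of MATSim.

```java
import java.util.OptionalDouble;

// Hypothetical illustration only: none of these names exist in MATSim.
final class ScalingResolution {

    // Option 1: the dependent param is an additional multiplier on the global scale.
    static double asMultiplier(double globalScale, double correction) {
        return globalScale * correction;
    }

    // Option 2: the dependent param overrides the global scale when set explicitly;
    // an empty value (the "null" default) falls back to the global scale.
    static double asOverride(double globalScale, OptionalDouble explicitFactor) {
        return explicitFactor.orElse(globalScale);
    }

    public static void main(String[] args) {
        double globalScale = 0.1; // assumed 10% sample
        System.out.println("multiplier: " + asMultiplier(globalScale, 1.2));
        System.out.println("override, not set: " + asOverride(globalScale, OptionalDouble.empty()));
        System.out.println("override, set: " + asOverride(globalScale, OptionalDouble.of(0.15)));
    }
}
```

With the override variant, a user who sets nothing gets the global value everywhere, which is the "automagic" behaviour mentioned above.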
@vsp-gleich
Contributor

I strongly support adding a config param scaling to the global config group. SimWrapperConfigGroup already has a param sampleSize, but the simwrapper contrib is not used by everyone and we shouldn't depend on it. Still, it shows that others have already seen the need for such a param. It would be much nicer to have something in the global config that we can depend on.

@jfbischoff
Collaborator

I'm also in favor - and yes, we use a parameter like this internally.
Storage and flow capacity parameters are often set by gut feeling. I think these should be set from the global parameter by default (that is what one would expect if one is new to MATSim, so I would not consider it automagic) and can be overridden if explicitly required.

@Janekdererste
Member

When we introduce this new factor, should this follow the current logic, where the scaling factors reduce the capacities on the network, or should we interpret the scaling factor the other way round?

I would favor a scaling factor of 10 for a 10% sample, since our input material, i.e. the population is already scaled down, and we need to interpret the simulation results as if they were 10 times bigger.

@marecabo
Contributor

marecabo commented Dec 5, 2024

> I would favor a scaling factor of 10 for a 10% sample, since our input material, i.e. the population is already scaled down, and we need to interpret the simulation results as if they were 10 times bigger.

I would recommend describing "what is" in the config (e.g., scale=0.1 for a 10% scenario) instead of "what should be done to scale up to 100%". Currently, the two main parameters, storage and flow cap, also describe what is present in the scenario. Introducing a new way of interpreting parameters might cause confusion. Also, I guess it might not be guaranteed that all output quantities can be scaled up linearly.

@jfbischoff
Collaborator

Mhm, a matter of taste, I'd say. Also a matter of perspective - we tend to use a 100% population and sample it down at the beginning of a simulation 😆
But either way is fine for me, though my preference would be the decimal.

@vsp-gleich self-assigned this Dec 9, 2024
@vsp-gleich
Contributor

vsp-gleich commented Dec 9, 2024

I am currently working on this issue in our MATSim Advanced lecture.

> change the "dependent" params (flowCapFactor, storCapFactor, countsScaleFactor, ...) such that they only correct the global parameter. This could be either an additional multiplier or an override. The latter might be easier to refactor, since one could set the default to "null" and then, if these params are not set explicitly, they are set from the global param.

This is harder than I thought. I haven't found any example where config group A depends on values from config group B. I am not sure this would even work, since we cannot rely on the groups being read in a certain order. We could work around that and replace e.g. all calls of qSimConfigGroup.getFlowCapFactor() with something else, but qSimConfigGroup.getFlowCapFactor() is used all around matsim-libs and certainly also outside.
Anyway, I would favor interpreting flowCapacityFactor as an override rather than as a multiplier.

@vsp-gleich
Contributor

@paulheinr Maybe you have some intuition how to solve the QSimConfigGroup depends on GlobalConfigGroup issue?

@vsp-gleich
Contributor

When upscaling the simulation results, it is not clear how to deal with vehicle volumes.

Public transit vehicles usually operate at 100%, independent of the population sample size. Therefore, public transit volumes should not be scaled up. For drt there are different approaches to model sample sizes (fewer seats per vehicle or a smaller fleet size), and upscaling is not necessarily linear.

I don't know how scaling is handled in freight traffic. @rewertvsp ?

In the mobsim we can reduce the capacity consumption of buses by setting small pcuEquivalents. However, pcuEquivalents can also be used to model that a truck uses more capacity than a passenger car. So this does not seem to be a reliable measure to define which vehicles are to be scaled up and which are not.

We can

  1. ignore the problem and not scale up any vehicle volume for the time being. However, I think most users would wrongly expect vehicle volumes to be scaled up just as we are planning to do with passenger numbers, trips etc.
  2. modify the VolumesAnalyzer to count vehicles of certain defined modes separately (default is public transit and drt, but this will probably need to be configurable) and scale up only all others. Maybe better than option 1, but still not optimal for drt. This needs the vehicles of the defined modes to be easily identifiable.
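
Option 2 could look roughly like the following sketch. It assumes that vehicle counts are already available per mode and that the sample size is given as a decimal; the class, the method and the mode names are hypothetical and this is not the VolumesAnalyzer API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; this is not the MATSim VolumesAnalyzer API.
final class ModeAwareVolumeScaling {

    // Scales per-mode vehicle counts up by 1/sampleSize, except for modes whose
    // vehicles already run at 100% (e.g. public transit).
    static Map<String, Double> scaleUp(Map<String, Integer> countsByMode,
                                       Set<String> unscaledModes,
                                       double sampleSize) {
        if (sampleSize <= 0.0 || sampleSize > 1.0) {
            throw new IllegalArgumentException("sampleSize must be in (0, 1]");
        }
        Map<String, Double> scaled = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : countsByMode.entrySet()) {
            double factor = unscaledModes.contains(entry.getKey()) ? 1.0 : 1.0 / sampleSize;
            scaled.put(entry.getKey(), entry.getValue() * factor);
        }
        return scaled;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("car", 120);
        counts.put("pt", 8);
        // Assumed 25% sample; "pt" is left unscaled.
        System.out.println(scaleUp(counts, Set.of("pt"), 0.25));
    }
}
```

The drt problem remains: putting drt into the unscaled set or leaving it out are both only approximations.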

@paulheinr
Contributor

> I would recommend describing "what is" in the config (e.g., scale=0.1 for a 10% scenario) instead of "what should be done to scale up to 100%". Currently, the two main parameters, storage and flow cap, also describe what is present in the scenario. Introducing a new way of interpreting parameters might cause confusion. Also, I guess it might not be guaranteed that all output quantities can be scaled up linearly.

I agree with that. I think the question is whether the demand or the supply is the reference. When we build models, our input network is fixed and we have a variable population (size). Thus, in my opinion, the scale factor should take as its reference the adjustment of the network needed to match the given population.

@paulheinr
Contributor

> @paulheinr Maybe you have some intuition how to solve the QSimConfigGroup depends on GlobalConfigGroup issue?

Good question. We have the interface org.matsim.core.config.consistency.ConfigConsistencyChecker. When called, the whole config should be present. Within a consistency checker, you could also adapt the config. Maybe this is an option?
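
A rough sketch of that idea, written with self-contained stand-in types so that it runs without MATSim: `FullConfig`, `GlobalSettings`, `QSimSettings` and the `Checker` interface are hypothetical and only mimic a checker that sees the whole config. The sample-size parameter in the global group is an assumption as well.

```java
// Self-contained stand-ins; the real MATSim config classes are not used here.
final class ConsistencyCheckerSketch {

    // Stand-in for the global config group with a hypothetical sample-size param.
    static final class GlobalSettings {
        double sampleSize = 1.0;
    }

    // Stand-in for the qsim config group; null means "not set explicitly".
    static final class QSimSettings {
        Double flowCapFactor;
        Double storageCapFactor;
    }

    static final class FullConfig {
        final GlobalSettings global = new GlobalSettings();
        final QSimSettings qsim = new QSimSettings();
    }

    // Stand-in for a consistency checker that is called once the whole config is loaded.
    interface Checker {
        void checkConsistency(FullConfig config);
    }

    // Fills unset capacity factors from the global sample size and keeps explicit values.
    static final class FillFromGlobal implements Checker {
        @Override
        public void checkConsistency(FullConfig config) {
            if (config.qsim.flowCapFactor == null) {
                config.qsim.flowCapFactor = config.global.sampleSize;
            }
            if (config.qsim.storageCapFactor == null) {
                config.qsim.storageCapFactor = config.global.sampleSize;
            }
        }
    }

    public static void main(String[] args) {
        FullConfig config = new FullConfig();
        config.global.sampleSize = 0.1;
        config.qsim.storageCapFactor = 0.2; // explicitly set, must survive
        new FillFromGlobal().checkConsistency(config);
        System.out.println(config.qsim.flowCapFactor + " / " + config.qsim.storageCapFactor);
    }
}
```

Because the checker runs after all groups are present, the read order of the config groups would no longer matter in such a setup.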

@Janekdererste
Member

Janekdererste commented Dec 12, 2024

We are currently developing yet another mobsim, where we decided not to scale down network capacities, but to scale the pce of vehicles. This means we only have one scaling factor, similar to the one proposed here. Also, we scale up the pce of passenger cars, but do not scale the pce of transit vehicles. This way, a single passenger car represents multiple vehicles depending on the scaling factor, but transit vehicles do not, as they usually run with their original schedule.

This has the following differences compared to the current implementation:

  • The responsibility of scaling is moved into the vehicle sources. This allows distinct scaling of different vehicle types.
  • pce values of VehicleTypes remain the same in input files regardless of the scaling used.
  • The traffic simulation code is free of capacity scaling logic, as this is handled elsewhere. By contrast, the QSim has many places where capacities are adjusted due to scaling factors.

This does not solve the issue of scaling the passenger capacities of vehicles. We have left this property untouched so far. However, vehicle sources are now free to adjust passenger capacities per vehicle type.
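
As a minimal sketch of this approach, assuming the sample size is given as a decimal and each vehicle type carries a flag that marks it as transit: the `VehicleType` class and `simulatedPce` method below are hypothetical and are not the MATSim vehicle API or the code of the new mobsim.

```java
// Hypothetical sketch; these types are not the MATSim vehicle API.
final class PceScalingSketch {

    static final class VehicleType {
        final String id;
        final double pce;      // pce as written in the input file, never rescaled there
        final boolean transit; // transit vehicles run their original schedule

        VehicleType(String id, double pce, boolean transit) {
            this.id = id;
            this.pce = pce;
            this.transit = transit;
        }
    }

    // pce that the vehicle source hands to the traffic simulation.
    static double simulatedPce(VehicleType type, double sampleSize) {
        if (sampleSize <= 0.0 || sampleSize > 1.0) {
            throw new IllegalArgumentException("sampleSize must be in (0, 1]");
        }
        // A sampled passenger car stands for 1/sampleSize cars; a transit vehicle stands for itself.
        return type.transit ? type.pce : type.pce / sampleSize;
    }

    public static void main(String[] args) {
        VehicleType car = new VehicleType("car", 1.0, false);
        VehicleType bus = new VehicleType("bus", 2.0, true);
        System.out.println(car.id + " -> " + simulatedPce(car, 0.1));
        System.out.println(bus.id + " -> " + simulatedPce(bus, 0.1));
    }
}
```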

@tschlenther
Contributor

Just a small remark: some code on the analysis side already has its own scaling config parameters.
Look at NoiseConfigGroup and AccidentsConfigGroup for examples.
Ideally, those should be replaced or filled by the new global parameter as well.
Of course, this would be a step to take after sorting out everything in the core.

@vsp-gleich
Contributor

I just talked to @rewertvsp. For freight there are multiple approaches: keep all freight demand, vehicles and tours at 100% or downsample the number of vehicles and tours. This then intersects with having either no passenger demand at all in the model or various population sample sizes. So we have scenarios with 100% freight demand and 3% population (e.g. RVR) and others with 10% freight demand and 10% population.

That means we can have multiple different scaling factors in the same scenario:

  1. population sample size (passenger demand)
  2. freight demand sample size
  3. network flow / storage capacity factor
  4. vehicle capacity factors: 100% of all public transit vehicles are simulated, but actually they should have a reduced passenger capacity (currently not implemented). Similar arguments might be made for drt and freight vehicles.

I like @Janekdererste's approach in the new mobsim that each vehicle type should provide its scaling factor for network capacity utilisation, in addition to pce. Talking of scaling in terms of a decimal value representing the sample size is confusing in this case, so I would switch to using an upscaleFactor. This would make for the following example vehicle types:

  • vehicle type car networkCapacityScaling=10 (10% population sample, so each car represents 10 cars), pce = 1.0, vehicleCapacityScaling=1
  • vehicle type bike networkCapacityScaling=10 (10% population sample, so each bike represents 10 bikes), pce = 0.5, vehicleCapacityScaling=1
  • vehicle type bus networkCapacityScaling=1, pce=2, vehicleCapacityScaling=0.1 (10% population sample, so only 1 in 10 seats should be usable in the mobsim. Maybe add 1 seat to reduce scaling artefacts).
  • vehicle type truck40t networkCapacityScaling=1, pce=3, vehicleCapacityScaling=1
  • vehicle type drt networkCapacityScaling=messy (10% population sample, but drt fleet size is somewhere between 10% and 100% and they drive more km/request due to less bundling), pce = 1.0, vehicleCapacityScaling=messy

Then we could eliminate network flowCapacityFactor and storageCapacityFactor, right? However, this also removes the possibility to scale flowCapacity and storageCapacity differently. I think we currently don't use that. Still sooner or later someone might miss it (e.g. to combat scaling artefacts while using a very small population sample). So should we set flowCapacity and storageCapacity separately per vehicle type?

It would be nice if networkCapacityScaling and vehicleCapacityScaling could be set via config, so as to use the same vehicle input files no matter which population sample size is used. For the output analysis we would need parameters for population sample size and freight demand sample size anyway.

This would need some association between vehicle types and their usages. Currently, only transit vehicles would need vehicleCapacityScaling and they are set in a separate transitVehicles file. But for networkCapacityScaling there is no identifier to tell that vehicle types car and bike are used for passenger transport and should have networkCapacityScaling set according to population sample size and truck40t should not be scaled.

My first approach was adding attributes per VehicleType networkCapacityScalingType=upScaleForPopulationSampleSize/upScaleForFreightSampleSize/doNotScale and vehicleCapacityScalingType=downScaleToPopulationSampleSize/downScaleToFreightSampleSize/doNotScale
But this is probably not flexible enough to deal with drt scaling.

Instead, we might define an interface VehicleScaling and assign one implementation to a vehicle type. E.g.

  • vehicle type car has vehicleScaling=PassengerVehicleOnNetworkScaling (upscales networkCapacityUtilisation according to populationSampleSize, does not modify vehicleCapacity)
  • vehicle type bus has vehicleScaling=TransitVehicleScaling (downscales vehicleCapacity according to populationSampleSize, does not modify networkCapacityUtilisation)
  • vehicle type drt has vehicleScaling=CustomDrtVehicleScaling (does weird custom adjustments)
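
A minimal sketch of such an interface with the first two implementations from the list. The method names and signatures are assumptions made for illustration; nothing like this exists in MATSim yet, and a custom drt implementation would simply be a third class.

```java
// Hypothetical sketch of the proposed interface; names and signatures are assumptions.
final class VehicleScalingSketch {

    interface VehicleScaling {
        // Factor applied to the network capacity a vehicle consumes, on top of its pce.
        double networkCapacityScaling(double populationSampleSize);

        // Factor applied to the passenger capacity of a vehicle.
        double vehicleCapacityScaling(double populationSampleSize);
    }

    // Passenger vehicles: each simulated vehicle stands for 1/sampleSize real vehicles.
    static final class PassengerVehicleOnNetworkScaling implements VehicleScaling {
        @Override
        public double networkCapacityScaling(double populationSampleSize) {
            return 1.0 / populationSampleSize;
        }

        @Override
        public double vehicleCapacityScaling(double populationSampleSize) {
            return 1.0;
        }
    }

    // Transit vehicles: all vehicles are simulated, but only a share of the seats is usable.
    static final class TransitVehicleScaling implements VehicleScaling {
        @Override
        public double networkCapacityScaling(double populationSampleSize) {
            return 1.0;
        }

        @Override
        public double vehicleCapacityScaling(double populationSampleSize) {
            return populationSampleSize;
        }
    }

    public static void main(String[] args) {
        double sampleSize = 0.1; // assumed 10% population sample
        VehicleScaling car = new PassengerVehicleOnNetworkScaling();
        VehicleScaling bus = new TransitVehicleScaling();
        System.out.println("car network factor: " + car.networkCapacityScaling(sampleSize));
        System.out.println("bus seat factor: " + bus.vehicleCapacityScaling(sampleSize));
    }
}
```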

For analysing outputs we need to differentiate passenger agents and freight agents. This can probably be done using the subpopulations.

So the current proposal to cater for everything would be:

  1. PlansConfigGroup has a parameter passengerDemandSampleSize (or populationSampleSize)=decimal 0..1 and a parameter passengerSubpopulations=berlinPassenger,brandenburgPassenger (and freight agents need their own separate subpopulation so as to exclude them here)
  2. FreightCarriersConfigGroup has a parameter freightDemandSampleSize=decimal 0..1 and a parameter freightSubpopulations=freightGrocery,freightKEP
  3. VehicleTypes have a parameter vehicleScaling
  4. interface VehicleScaling upscales networkCapacityUtilisation and downscales vehicleCapacity based on populationSampleSize and freightDemandSampleSize. Some default implementations are provided, but custom implementations especially for drt can be used.
  5. QSimConfigGroup flowCapacityFactor and storageCapacityFactor are removed
  6. Output analysis weights vehicle volumes (and noise etc.) based on networkCapacityUtilisation set in VehicleTypes via VehicleScaling interface
  7. Output analysis weights trips (and pax-km, etc.) based on passengerSubpopulations and populationSampleSize
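
Point 7 could be sketched as follows, assuming the analysis knows the subpopulation of each trip's agent. The class and method are hypothetical; the subpopulation names are the examples from the list above, and leaving unknown subpopulations unscaled is an assumption of this sketch.

```java
import java.util.Set;

// Hypothetical sketch of subpopulation-based trip weighting; not existing MATSim analysis code.
final class TripWeightingSketch {

    // Upscaling weight of one trip, chosen by the subpopulation of the travelling agent.
    static double tripWeight(String subpopulation,
                             Set<String> passengerSubpopulations, double populationSampleSize,
                             Set<String> freightSubpopulations, double freightDemandSampleSize) {
        if (passengerSubpopulations.contains(subpopulation)) {
            return 1.0 / populationSampleSize;
        }
        if (freightSubpopulations.contains(subpopulation)) {
            return 1.0 / freightDemandSampleSize;
        }
        return 1.0; // assumption: unknown subpopulations are left unscaled
    }

    public static void main(String[] args) {
        Set<String> passenger = Set.of("berlinPassenger", "brandenburgPassenger");
        Set<String> freight = Set.of("freightGrocery", "freightKEP");
        // Assumed scenario: 10% population sample, 100% freight demand.
        System.out.println(tripWeight("berlinPassenger", passenger, 0.1, freight, 1.0));
        System.out.println(tripWeight("freightKEP", passenger, 0.1, freight, 1.0));
    }
}
```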

passengerDemandSampleSize could be moved to the plans file itself since we tend to have one plans file per sample size. But this gets confusing if the sample is reduced after reading a plans file or if the population is created in code.
A more radical approach would be to assign each person a weight (somewhat in line with what @kainagel proposed for the output trips.csv and person.csv). However, this would make analysis much more complicated, especially if persons can have different weights.
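
For illustration, a small sketch of what per-person weights would mean for a simple analysis, assuming each output row carries the weight of its person. The class and method names are hypothetical.

```java
// Hypothetical sketch of survey-style person weights; not existing MATSim code.
final class PersonWeightSketch {

    // Uniform weight for a plain random sample, e.g. 10 for a 10% sample.
    static double uniformWeight(double sampleSize) {
        return 1.0 / sampleSize;
    }

    // Upscaled number of trips: each person's trip count is multiplied by that person's weight.
    static double weightedTrips(double[] personWeights, int[] tripsPerPerson) {
        if (personWeights.length != tripsPerPerson.length) {
            throw new IllegalArgumentException("one weight per person is required");
        }
        double total = 0.0;
        for (int i = 0; i < personWeights.length; i++) {
            total += personWeights[i] * tripsPerPerson[i];
        }
        return total;
    }

    public static void main(String[] args) {
        double w = uniformWeight(0.1);
        double[] weights = {w, w, w};
        int[] trips = {2, 3, 1};
        System.out.println(weightedTrips(weights, trips));
    }
}
```

With uniform weights this reduces to multiplying the total by a single factor; the extra complexity only appears once persons carry different weights.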

All in all, this gets much more complicated than I thought. However, the current manual scaling scattered all over MATSim is complicated, too, and likely less coherent.

Opinions?

@vsp-gleich
Contributor

Currently all analysis code I found either scales with a single scalingFactor for all trips / vehicles considered or does not scale at all. (single factor each in contribs application, noise and accidents)
Some analysis excludes certain vehicle types, e.g. noise excludes bus and bike. The perfect solution there would probably be a scaling factor and a noise rate (a truck is noisier than a car) per vehicle type.

Projects
Status: Todo (low priority)
7 participants