Inference Engine Outline

Current Inference Engine Operations

This section gives a high-level description of the variables (state) being estimated, the measurements (signals) used to estimate the state, how those signals relate to the state, and how the estimated state is modeled.

The Estimated Variables

The primary purpose of the inference engine (IE) is to determine which block each vehicle is servicing and how far along that block the vehicle is.

block: an id of one of the blocks in the GTFS
distance along block (dab): (derived from block and time) the distance in meters that a vehicle has traveled along that block.
orientation (derived from block and time)
location (lat/lng) (derived from block and time)
journey state (can be derived from block and time)

The Measurements (Signals)

Destination Sign Code

Uses: DSC

Use the DSC to determine if the particle is a good estimate of the state. Code Here

Each block has a set of trips that it serves. Vehicles on those trips will display a sign code. If that sign code matches a trip served by the particle's block, that is an indication that the particle is a good estimate.
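A minimal sketch of the DSC check described above. The block ids, sign codes, and the mismatch weight are invented for illustration; only the 13/30 in-service match weight comes from this outline. The real DscLikelihood may weigh more cases than this.

```python
# Hypothetical lookup: block id -> sign codes of the trips that block serves.
BLOCK_SIGN_CODES = {
    "block_1": {"4620", "4621"},
    "block_2": {"1234"},
}

MATCH_WEIGHT = 13 / 30    # in-service DSC match weight cited in this outline
MISMATCH_WEIGHT = 1 / 30  # placeholder, not taken from the real engine


def dsc_likelihood(particle_block, observed_dsc):
    """Return a weight for how well the observed DSC supports the particle."""
    served = BLOCK_SIGN_CODES.get(particle_block, set())
    return MATCH_WEIGHT if observed_dsc in served else MISMATCH_WEIGHT
```

A particle on block_1 scores higher when the observed sign code is "4620" than when it is "1234".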

-----Discuss additional weightings related to in-service (i.s.), out-of-service (o.o.s.), etc.

Edge

Uses: Lat/Lng, Orientation

Each particle has a guess as to which edge it is on, where an edge is the straight-line segment between two stops (or between two points in a shape file). A vehicle traveling along this edge has an expected average velocity and an expected heading. Do the observed velocity and heading of the vehicle match what is expected?
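As a rough illustration of the heading half of this check, the sketch below computes the expected heading of an edge from its two end points and compares it with an observed heading. The function names are hypothetical, and how the real Edge likelihood turns the difference into a weight is not shown here.

```python
import math


def edge_bearing(lat1, lng1, lat2, lng2):
    """Initial compass bearing in degrees (0 = north, 90 = east) from point 1 to point 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlng = math.radians(lng2 - lng1)
    y = math.sin(dlng) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlng)
    return (math.degrees(math.atan2(y, x)) + 360) % 360


def heading_difference(a, b):
    """Smallest absolute angle in degrees between two headings."""
    return abs((a - b + 180) % 360 - 180)
```

A small difference between edge_bearing(...) for the particle's edge and the observed direction suggests the particle is on a plausible edge; the wrap-around handling means 350 and 10 degrees are treated as 20 degrees apart.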

Block

Uses: block_id

This likelihood is made possible by the pull-in/pull-out service. The block id for each vehicle can be retrieved from that service. This signal can carry a very large weight, since the block is explicitly assigned.

Is the block id occasionally returned in the observation? In the example trace (4855) that I have been using, this never happens.

Gps

Uses: Lat/Lng

From the estimated block and distance along the block, you can arrive at a GPS coordinate. Compare it with the GPS coordinate returned in the observation to determine whether the particle closely matches. 'Closely matching' is based on how far away (in meters) the observed lat/lng is from the particle's guess at the lat/lng. The weight is derived from where the observed point falls in a 'folded pdf' with a known mean and standard deviation.
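One way to read the paragraph above, sketched under assumptions: the 'folded pdf' is taken to be a folded normal density evaluated at the great-circle distance between the two points. The mean of 0 m and standard deviation of 50 m are placeholders, not the values used in the engine.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two lat/lng points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlng = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlng / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def folded_normal_pdf(x, mean, sd):
    """Density of |N(mean, sd^2)| at x; zero for negative x."""
    if x < 0:
        return 0.0
    norm = 1.0 / (sd * math.sqrt(2 * math.pi))
    return norm * (math.exp(-((x - mean) ** 2) / (2 * sd ** 2))
                   + math.exp(-((x + mean) ** 2) / (2 * sd ** 2)))


def gps_likelihood(observed, particle, mean=0.0, sd=50.0):
    """Weight a particle by how close its lat/lng is to the observed lat/lng."""
    distance = haversine_m(observed[0], observed[1], particle[0], particle[1])
    return folded_normal_pdf(distance, mean, sd)
```

With a mean of zero, a particle whose derived location sits on top of the observation gets the highest weight, and the weight falls off as the distance grows.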

Moved

Uses: Lat/Lng

Is the vehicle scheduled to be moving, or scheduled to be stationary, right now? (Double check this, not sure if this is the correct interpretation.) If that matches the observation, this particle might be a good match.

NullLocation

If the state of the particle is anything except in-progress, the weight is lowered. Not sure why.

NullState

Very similar to NullLocation: returns a very low number if the vehicle is DEADHEAD_AFTER, AT_BASE, DEADHEAD_BEFORE, or LAYOVER_BEFORE AND has zero schedule deviation.

If the vehicle has no schedule deviation, it returns a low number. If the vehicle is NOT in one of the states above and the schedule deviation is non-zero, it returns a high number.

Run

Has the Op (operator?) assigned a run to this vehicle? The logic does not seem to concern the particle at all, only the observation. I must be missing something there.

Modeling the change in estimated variables over time

Distance Along Block

Potential Enhancements

Potential new Signals

Pull-In and Pull-Out Service

The Pull-In/Pull-Out service will assign a block id (assignedBlockId) to the vehicle instance.

Stalled Status

Potential New Likelihood

Each block contains a set of trips in the STIF. This includes deadhead trips. Deadhead trips have a known origin, destination, start time, and end time. Since a deadhead does not have a route, the Edge likelihood will no longer be a good estimate.

Automated Particle Weighting

Are the current weight assignments optimal? Can they be improved with an automated process? The weights inside each likelihood are hardcoded. E.g., the weight in the DscLikelihood for matching a DSC while in service is 13/30. How was that number derived? Would it be possible to make these weights variables and run several trials with different values to see if other numbers lead to better results? Something like particle swarm optimization could be used to close in on the optimal weights.
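A minimal sketch of what such trials could look like. It uses plain random search rather than particle swarm optimization, simply because it is shorter to show. The evaluate function is a stand-in supplied by the caller; in practice it would replay a set of traces through the IE with the candidate weights and return a matching score. All names here are hypothetical.

```python
import random


def random_search(evaluate, bounds, trials=200, seed=0):
    """Try random weight settings and keep the one with the best score.

    evaluate: function taking a dict of weights and returning a score (higher is better).
    bounds:   dict mapping weight name -> (low, high) range to sample from.
    """
    rng = random.Random(seed)
    best_weights, best_score = None, float("-inf")
    for _ in range(trials):
        weights = {name: rng.uniform(low, high) for name, (low, high) in bounds.items()}
        score = evaluate(weights)
        if score > best_score:
            best_weights, best_score = weights, score
    return best_weights, best_score
```

For example, random_search(score_fn, {"dsc_match_in_service": (0.0, 1.0)}) would explore alternatives to the hardcoded 13/30, given some score_fn that measures matching quality.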

Measuring Improvements

If any changes are made to the inference engine, then we will need to measure how those changes affect matching vehicles to blocks and whether or not improved matching leads to improved predictions downstream of the IE.

How will we know if the changes have led to improved matching?

Using non-revenue data to improve the performance of the IE should lead to better matching at the start of trips or on trips that have yet to begin. One measure of success will be to determine if the improvements lead to earlier matching. For example, if the current IE matches a vehicle to a trip after 3 or 4 observations, we can call the improvements a success if they are matching after only 1 or 2 observations, or even matching before any observations have been recorded on the trip.
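The "earlier matching" measure could be computed per trace along these lines. This is only a sketch: it assumes each trace can be reduced to the list of trips the IE inferred at each observation, plus the trip the vehicle was actually on. Both inputs are hypothetical shapes, not an existing format.

```python
def observations_until_match(inferred_trips, actual_trip):
    """Return the 1-based index of the first observation matched to actual_trip.

    Returns None if the trace never matches.
    """
    for index, inferred in enumerate(inferred_trips, start=1):
        if inferred == actual_trip:
            return index
    return None
```

Running this over the same traces before and after a change gives two numbers per trace; a lower number after the change means the vehicle was matched earlier.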

In order to make these comparisons, a representative set of traces needs to be found. This set of traces should include:

traces with good matching along the entire length of the trip
traces for trips that include a deadhead partway through, where matching is less successful immediately after the deadhead period
traces for trips that do not match initially due to a layover or deadhead before the trip begins

Improvements to the IE should increase successful matching on trips that include deadhead before/during, or layover before/during. The improvements should not affect matching on trips that already match successfully for the full length of the trip.

How will we know that the improved matching leads to improved predictions?

The prediction algorithms exist outside of the inference engine. The goal is to provide the prediction algorithm with better matching. With better matching (and earlier matching), . . .

Index

All Current Measurements/Inputs

"RealtimeEnvelope":{ 
    "UUID":"alphanumericstring",
    "timeReceived": 1538438400629,
    "UUID":"alphanumericstring",
    "timeReceived":1538438400528,
    "CcLocationReport":{
        "request-id":1808,
        "vehicle":{
            "vehicle-id":297,
            "agency-id":2008,
            "agencydesignator":"MTA NYCT"
        },
        "status-info":0,
        "time-reported":"2018-10-02T23:59:57.0-00:00",
        "latitude":40836203,
        "longitude":-73879575,
        "direction":{"deg":203.51},
        "speed":30,
        "manufacturer-data":"alphanumericstring",
        "operatorID":{
            "operator-id":0,
            "designator":"0"
        },
        "runID":{
          "run-id":0,
          "designator":"000"
        },
       "destSignCode":6,
       "routeID":{
           "route-id":0,
           "route-designator":"0"
       },
       "localCcLocationReport":{
           "NMEA":{
               "sentence":["$GPGGA,235957.000,4050.17222,N,07352.77448,W,1,10,01.0,+00014.0,M,,M,,*4F","$GPRMC,235957.00,A,4050.172222,N,07352.774480,W,000.000,203.51,021018,,,A*70"]
           },
          "vehiclepowerstate":1
       }
   }
}
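A note on units in the sample above: the integer latitude and longitude fields appear to be in micro-degrees, which is consistent with the NMEA sentences in the same record (4050.17222,N is 40 degrees plus 50.17222 minutes). The sketch below shows both conversions; the helper names are made up for illustration and the micro-degree reading is an inference from this one sample.

```python
def microdegrees_to_degrees(value):
    """Convert an integer micro-degree field (e.g. 40836203) to decimal degrees."""
    return value / 1_000_000


def nmea_to_degrees(field, hemisphere):
    """Convert an NMEA ddmm.mmmm / dddmm.mmmm field to signed decimal degrees."""
    value = float(field)
    degrees = int(value // 100)      # leading digits are whole degrees
    minutes = value - degrees * 100  # remainder is minutes
    decimal = degrees + minutes / 60
    return -decimal if hemisphere in ("S", "W") else decimal
```

For the record shown, microdegrees_to_degrees(40836203) and nmea_to_degrees("4050.17222", "N") agree to within about a millionth of a degree.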
