Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Linux: update maple tree extension to fix issue #1032 #1034

Conversation

eve-mem
Copy link
Contributor

@eve-mem eve-mem commented Nov 10, 2023

Hello,

While debugging the maple tree parsing to see what has caused this issue #1032 for @4n6-fl I realized how I could recreate the issue and fix it.

For some reason that I don't understand (and I would like to understand so, if this makes sense to you please would you team me) if the get_slot_iter function is called twice it will fail, the seen set still contains values when the _parse_maple_tree_node function starts. It's supposed to be an optional parameter with the default set to an empty set - so I'm not really sure what is going on. It's likely some python I don't understand.

To recreate the issue it is as simple as opening volshell and trying to to parse the maple tree twice.

(layer_name) >>> [ vma for  vma in ps()[-2].mm.mm_mt.get_slot_iter()]
[154723148639504, 154723148637832, 154723148638592, 154723148639352, 154723353319200, 154723148638744, 154723353319352, 154723353318592, 154723353320872, 154723353318896, 154723353317984, 154723353317680, 154723353317832, 154723353321176, 154723353318744, 154723353318440, 154723148638440, 154723353320720, 154723353319504, 154723148637224, 154723353317376, 154723353319656, 154723353319048, 154723148638896, 154723148636312, 154723148637984, 154723148636160, 154723353318288, 154723148639200, 154723148637376, 154723148639808, 154723148638136, 154723353317528, 154723148636464, 154723353321024, 154723148639048, 154723148636920, 154723148639960, 154723148636616, 154723353320112, 154723148636768, 154723148638288, 154723148637528, 154723148637680]
(layer_name) >>> [ vma for  vma in ps()[-2].mm.mm_mt.get_slot_iter()]
WARNING  volatility3.framework.symbols.linux.extensions: The mte 0x8cb84e6c971e has all ready been seen, no further results will be produced for this node.
WARNING  volatility3.framework.symbols.linux.extensions: The mte 0x8cb84e6c971e has all ready been seen, no further results will be produced for this node.

It is the double parsing that causes the bash plugin to have this issue. It does so here https://github.com/volatilityfoundation/volatility3/blob/develop/volatility3/framework/plugins/linux/bash.py#L82 and https://github.com/volatilityfoundation/volatility3/blob/develop/volatility3/framework/plugins/linux/bash.py#L92.

This PR changes the call to _parse_maple_tree_node to actually specify the empty set - which seems to do the trick as seen here:

(layer_name) >>> [ vma for  vma in ps()[-2].mm.mm_mt.get_slot_iter()]
[154723148639504, 154723148637832, 154723148638592, 154723148639352, 154723353319200, 154723148638744, 154723353319352, 154723353318592, 154723353320872, 154723353318896, 154723353317984, 154723353317680, 154723353317832, 154723353321176, 154723353318744, 154723353318440, 154723148638440, 154723353320720, 154723353319504, 154723148637224, 154723353317376, 154723353319656, 154723353319048, 154723148638896, 154723148636312, 154723148637984, 154723148636160, 154723353318288, 154723148639200, 154723148637376, 154723148639808, 154723148638136, 154723353317528, 154723148636464, 154723353321024, 154723148639048, 154723148636920, 154723148639960, 154723148636616, 154723353320112, 154723148636768, 154723148638288, 154723148637528, 154723148637680]
(layer_name) >>> [ vma for  vma in ps()[-2].mm.mm_mt.get_slot_iter()]
[154723148639504, 154723148637832, 154723148638592, 154723148639352, 154723353319200, 154723148638744, 154723353319352, 154723353318592, 154723353320872, 154723353318896, 154723353317984, 154723353317680, 154723353317832, 154723353321176, 154723353318744, 154723353318440, 154723148638440, 154723353320720, 154723353319504, 154723148637224, 154723353317376, 154723353319656, 154723353319048, 154723148638896, 154723148636312, 154723148637984, 154723148636160, 154723353318288, 154723148639200, 154723148637376, 154723148639808, 154723148638136, 154723353317528, 154723148636464, 154723353321024, 154723148639048, 154723148636920, 154723148639960, 154723148636616, 154723353320112, 154723148636768, 154723148638288, 154723148637528, 154723148637680]

Now I rolled back to 804d68d from #996 to see if this has always been like that, and yes. Calling it twice on that commit still caused it to error. I would have tested it at the time, and alishir on slack also said this patch fixed there issue so I'm not exactly sure what's happened. I can only imagine I made some mistake with my testing, or updating python has changed something?

Here is the output showing this fixes the issue with the bash plugin:

$ python vol.py -f v6.1.15.dmp linux.bash
Volatility 3 Framework 2.5.2
Progress:  100.00               Stacking attempts finished
PID     Process CommandTime     Command

826     bash    2023-03-11 20:09:54.000000      ssh root@10.10.10.2 mkdir -p /mnt/data/$VERSION
826     bash    2023-03-11 20:09:54.000000      ssh-copy-id root@10.10.10.2
826     bash    2023-03-11 20:09:54.000000      apt update
826     bash    2023-03-11 20:09:54.000000      sudo apt update
826     bash    2023-03-11 20:09:54.000000      uname -a
<SNIP>

@eve-mem
Copy link
Contributor Author

eve-mem commented Nov 30, 2023

Hello,

I've been digging in to work out what is going on here that caused the issue, and it is very much a python thing rather than anything at all to do with vol that I didn't understand at the time.

It's explained here - and just something I didn't understand - https://rules.sonarsource.com/python/RSPEC-5717/

It's very obvious with this example.

def growing_string(default_list=[]):
    default_list.append('A')
    print(default_list)

Each call to growing_string without providing a parameter shows the list growing, while calling it with a parameter doesn't. default list is made when the function is called the first time, and reused with the later calls.

>>> growing_string()
['A']
>>> growing_string()
['A', 'A']
>>> growing_string()
['A', 'A', 'A']
>>> growing_string([]) 
['A']
>>> growing_string([])
['A']
>>> growing_string([])
['A']

Take this further snippet, which doesn't use any parts of vol, but I try to mimic the maple tree parsing. To show why this affects the parsing.

import random

def parent(seen=set()):
    print('starting_with_seen:', seen)
    yield from child(seen)

def child(seen, depth=0):
    print('child_started_with:', seen)
    rand_int = random.randint(0,100)
    if rand_int in seen:
        return None
    else:
        seen.add(rand_int)
        yield (rand_int, depth)
        if random.random() < 0.6:
            yield from child(seen, depth+1)

If you call parent it will yield from child a few random numbers, but never the same number twice as this seen set is passed around. Just like how the maple tree parsing works.

The default value for the seen set in parent is an empty set, so I had falsely assumed that each time you call parent without specifying seen that the default would be used - e.g. an empty set.

However that is provably not true, notice how each time parent is called here

>>> _ = [ print(val) for val in parent() ]
starting_with_seen: set()
child_started_with: set()
(24, 0)
>>>
>>> _ = [ print(val) for val in parent() ]
starting_with_seen: {24}
child_started_with: {24}
(28, 0)
>>> _ = [ print(val) for val in parent() ]
starting_with_seen: {24, 28}
child_started_with: {24, 28}
(90, 0)
child_started_with: {24, 90, 28}
(5, 1)
>>>
>>> _ = [ print(val) for val in parent() ]
starting_with_seen: {24, 90, 28, 5}
child_started_with: {24, 90, 28, 5}
(21, 0)
child_started_with: {5, 21, 24, 90, 28}
(94, 1)
child_started_with: {5, 21, 24, 90, 28, 94}

The starting_with_seen prints show that this set is growing each time parent is called without providing a new value for the the seen variable.

If you call this but specify that an empty set should be used it works as expected. starting_with_seen shows an empty set.

>>> _ = [ print(val) for val in parent(set()) ]
starting_with_seen: set()
child_started_with: set()
(65, 0)
child_started_with: {65}
(57, 1)
>>>
>>> _ = [ print(val) for val in parent(set()) ]
starting_with_seen: set()
child_started_with: set()
(66, 0)
child_started_with: {66}
(21, 1)
child_started_with: {66, 21}
(73, 2)
child_started_with: {73, 66, 21}
(84, 3)
child_started_with: {73, 66, 84, 21}
(36, 4)

That is why the maple tree parsing wouldn't work previously when called twice - the seen set was just growing. I thought it would start as an empty set, but that is not true. For the maple tree parsing to work that set must start as an empty set - else it will think it's already seen the root node and stop.

@4n6-fl
Copy link

4n6-fl commented Nov 30, 2023

Thanks !

@ikelos
Copy link
Member

ikelos commented Nov 30, 2023

Ah good spot! I'm usually pretty good about spotting so-called mutable types in default parameters, but that one snuck by. The solution is to default to None and then at the very start, detect if it's None and make it an empty set if so. None isn't mutable and will barf if it isn't changed when someone tries to use it as a set, so it acts as a good way to signify the need for initialization.

I think that'd be a better long term fix, rather than just making sure that this one input is correct but leaving the problem open for others in the future.

@ikelos ikelos linked an issue Nov 30, 2023 that may be closed by this pull request
@atcuno
Copy link
Contributor

atcuno commented Nov 30, 2023

I believe the cleanest way to do this is make the current _parse_maple_tree_node into an internal function named something like _do_parse_maple_tree_node (the "_do" part being a Linux kernel convention). Then, _parse_maple_tree_node would call the _do version with an empty set and callers of _parse_maple_tree_node would not been concerned with seen. Since the function is recursive, having it initialized in the recursive function is going to make where different entry points won't have the full set of previously seen nodes.

@ikelos
Copy link
Member

ikelos commented Nov 30, 2023

But if it's recursive, surely the call inside the function will populate that parameter, and it won't be an issue anyway? There shouldn't be a call where a value is implicitly expected rather than explicitly being passed in?

@eve-mem
Copy link
Contributor Author

eve-mem commented Dec 1, 2023

Hello @ikelos and @atcuno - thank you for taking a look at this.

I've updated it now to set the default of 'seen' to be None. Then when it's None (e.g. the first call into it) a new empty set is created. I've added probably a very long and unneeded comment to explain why you might want to call the function but actually provide a set rather than allowing it to be none.

I'd imagine most use would actually come via the get_slot_iter function which then calls out to _parse_maple_tree_node to get all the nodes.

I would like to add more to the extension in the future so that the maple tree can be queried as it can in the linux kernel. e.g. given an address find the matching slot by walking the tree and comparing the pivots. I think that would be useful functionality for none mm struct uses of maple trees (not that I've come across that yet myself...).

Let me know what you think about the changes. Thanks again for all your help!

Copy link
Member

@ikelos ikelos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, and I'm happy with long comments that clarify issues that took us time to figure out in the past. 5;)

@ikelos ikelos merged commit 21af01d into volatilityfoundation:develop Dec 1, 2023
14 checks passed
@eve-mem eve-mem deleted the linux_maple_tree_seen_parameter_fix branch December 1, 2023 13:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Volatility3 linux.Bash plugin not working
4 participants