Commit

Merge pull request #2 from microsoft/main
Update branch with msticpy main
d3vzer0 authored May 5, 2023
2 parents 41f8589 + c0fad01 commit ead4cb5
Showing 41 changed files with 3,763 additions and 41 deletions.
5 changes: 3 additions & 2 deletions conda/conda-reqs.txt
@@ -10,14 +10,14 @@ azure-mgmt-network>=2.7.0
azure-mgmt-resource>=16.1.0
azure-storage-blob>=12.5.0
beautifulsoup4>=4.0.0
bokeh>=1.4.0, <=3.1.0
bokeh>=1.4.0, <=2.4.3
cryptography>=3.1
deprecated>=1.2.4
dnspython>=2.0.0, <3.0.0
folium>=0.9.0
geoip2>=2.9.0
html5lib
httpx==0.23.3
httpx==0.24.0
ipython>=7.23.1
ipywidgets>=7.4.2, <8.0.0
keyring>=13.2.1
@@ -30,6 +30,7 @@ msrestazure>=0.6.0
networkx>=2.2
numpy>=1.15.4
pandas>=1.4.0, <2.0.0
panel>=0.14.4
pygments>=2.0.0
pyjwt>=2.3.0
python-dateutil>=2.8.1
1,421 changes: 1,421 additions & 0 deletions docs/notebooks/LocalData-osquery.ipynb

Large diffs are not rendered by default.

459 changes: 459 additions & 0 deletions docs/notebooks/PollingDetection.ipynb

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/requirements.txt
@@ -2,7 +2,7 @@ attrs>=18.2.0
cryptography
deprecated>=1.2.4
docutils<0.20.0
httpx==0.23.3
httpx==0.24.0
ipython >= 7.1.1
jinja2<3.2.0
numpy>=1.15.4
1 change: 1 addition & 0 deletions docs/source/DataAcquisition.rst
@@ -26,6 +26,7 @@ Individual Data Environments
data_acquisition/DataProv-Sumologic
data_acquisition/DataProv-Kusto
data_acquisition/DataProv-Cybereason
data_acquisition/DataProv-OSQuery


Built-in Data Queries
7 changes: 7 additions & 0 deletions docs/source/api/msticpy.data.drivers.local_osquery_driver.rst
@@ -0,0 +1,7 @@
msticpy.data.drivers.local\_osquery\_driver module
==================================================

.. automodule:: msticpy.data.drivers.local_osquery_driver
   :members:
   :undoc-members:
   :show-inheritance:
1 change: 1 addition & 0 deletions docs/source/api/msticpy.data.drivers.rst
@@ -18,6 +18,7 @@ Submodules
msticpy.data.drivers.kql_driver
msticpy.data.drivers.kusto_driver
msticpy.data.drivers.local_data_driver
msticpy.data.drivers.local_osquery_driver
msticpy.data.drivers.mdatp_driver
msticpy.data.drivers.mordor_driver
msticpy.data.drivers.odata_driver
7 changes: 7 additions & 0 deletions docs/source/api/msticpy.vis.data_viewer_panel.rst
@@ -0,0 +1,7 @@
msticpy.vis.data\_viewer\_panel module
======================================

.. automodule:: msticpy.vis.data_viewer_panel
   :members:
   :undoc-members:
   :show-inheritance:
1 change: 1 addition & 0 deletions docs/source/api/msticpy.vis.rst
@@ -14,6 +14,7 @@ Submodules

msticpy.vis.code_view
msticpy.vis.data_viewer
msticpy.vis.data_viewer_panel
msticpy.vis.entity_graph_tools
msticpy.vis.figure_dimension
msticpy.vis.foliummap
174 changes: 174 additions & 0 deletions docs/source/data_acquisition/DataProv-OSQuery.rst
@@ -0,0 +1,174 @@
The OSQuery provider
====================

:py:mod:`OSQuery driver documentation<msticpy.data.drivers.local_osquery_driver>`

The ``OSQuery`` data provider can read OSQuery log files
and provide convenient query functions for each OSQuery "table"
(or event type) contained in the logs.

The provider can read one or more log files, or multiple log files
in multiple folders. The files are read, converted to pandas
DataFrames and grouped by table/event. In addition, date fields
within the data are converted to pandas Timestamp format.

.. code:: ipython3

    import msticpy as mp

    qry_prov = mp.QueryProvider("OSQueryLogs", data_paths=["~/my_logs"])
    qry_prov.connect()
    df_processes = qry_prov.osquery.processes()

The provider's query functions ignore any parameters and do
no further filtering. You can use pandas to do additional filtering
and sorting of the data, or use it directly with other MSTICPy
functionality.
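
As a minimal sketch of that follow-up filtering, the example below uses a
small hand-built DataFrame as a stand-in for the output of a provider query.
The column names are borrowed from the sample ``processes`` output shown
later in this document; the row values are invented for illustration only.

```python
import pandas as pd

# Stand-in for the DataFrame returned by a provider query function.
# Column names follow the sample output in this document; values are illustrative.
df_processes = pd.DataFrame(
    {
        "hostIdentifier": ["jumpvm", "jumpvm", "webvm"],
        "unixTime": pd.to_datetime(
            ["2023-03-16 03:08:58", "2023-03-16 04:10:00", "2023-03-16 05:00:00"],
            utc=True,
        ),
        "name_": ["kthreadd", "sshd", "nginx"],
        "uid": ["0", "0", "33"],
    }
)

# Keep only root-owned processes on one host, newest first.
is_target = (df_processes["hostIdentifier"] == "jumpvm") & (df_processes["uid"] == "0")
root_procs = (
    df_processes[is_target]
    .sort_values("unixTime", ascending=False)
    .reset_index(drop=True)
)
print(root_procs[["name_", "unixTime"]])
```

The same boolean-mask and ``sort_values`` pattern applies to any table the
provider returns, since each query result is an ordinary pandas DataFrame.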

OSQuery Configuration
---------------------

You can store your connection details in *msticpyconfig.yaml*,
instead of supplying the ``data_paths`` parameter to
the ``QueryProvider`` class.

For more information on using and configuring *msticpyconfig.yaml* see
:doc:`msticpy Package Configuration <../getting_started/msticpyconfig>`
and :doc:`MSTICPy Settings Editor<../getting_started/SettingsEditor>`

The OSQuery settings in the file should look like the following:

.. code:: yaml

    DataProviders:
      ...
      OSQuery:
        data_paths:
          - /home/user1/sample_data
          - /home/shared/sample_data
        cache_file: ~/.msticpy/os_query_cache.pkl

The ``cache_file`` entry is explained later.

Expected log file format
------------------------

Each log file must be a text file containing one JSON record per line.
An example is shown below:

.. parsed-literal::

    {"name":"pack_osquery-snapshots-pack_python_packages","hostIdentifier":"jumpvm","calendarTime":"Thu Mar 16 09:22:33 2023 UTC","unixTime":1678958553,"epoch":0,"counter":0,"numerics":false,"decorations":{"host_uuid":"40443dd9-5b21-a345-8f89-aadde84c3719","username":"LOGIN"},"columns":{"author":"Python Packaging Authority","directory":"/usr/lib/python3.9/site-packages/","license":"UNKNOWN","name":"setuptools","path":"/usr/lib/python3.9/site-packages/setuptools-50.3.2.dist-info/","summary":"Easily download, build, install, upgrade, and uninstall Python packages","version":"50.3.2"},"action":"snapshot"}
    {"name":"pack_osquery-snapshots-pack_dns_resolvers","hostIdentifier":"jumpvm","calendarTime":"Thu Mar 16 13:14:10 2023 UTC","unixTime":1678972450,"epoch":0,"counter":0,"numerics":false,"decorations":{"host_uuid":"40443dd9-5b21-a345-8f89-aadde84c3719","username":"LOGIN"},"columns":{"address":"168.63.129.16","id":"0","netmask":"32","options":"705","type":"nameserver"},"action":"snapshot"}

Each JSON record is expected to have a ``name`` field, identifying
the event type, along with child dictionaries (``columns`` and ``decorations``).

.. code:: json

    {
      "name": "pack_osquery-snapshots-pack_dns_resolvers",
      "hostIdentifier": "jumpvm",
      "calendarTime": "Thu Mar 16 13:14:10 2023 UTC",
      "unixTime": 1678972450,
      "epoch": 0,
      "counter": 0,
      "numerics": false,
      "decorations": {
        "host_uuid": "40443dd9-5b21-a345-8f89-aadde84c3719",
        "username": "LOGIN"
      },
      "columns": {
        "address": "u5r0qfkczeeejf3qb20cha0ihb.bx.internal.cloudapp.net",
        "id": "0",
        "netmask": "",
        "options": "705",
        "type": "search"
      },
      "action": "snapshot"
    }

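
To make the expected layout concrete, here is a minimal sketch of reading
records in this format and grouping them by their ``name`` field, using only
the Python standard library. It is an illustration of the file format, not
the driver's actual implementation; ``group_records_by_name`` is a
hypothetical helper and the two records are abbreviated from the samples above.

```python
import json
from collections import defaultdict

# Two abbreviated records in the one-JSON-object-per-line layout described above.
sample_log = """\
{"name": "pack_osquery-snapshots-pack_dns_resolvers", "hostIdentifier": "jumpvm", "unixTime": 1678972450, "columns": {"address": "168.63.129.16", "type": "nameserver"}, "action": "snapshot"}
{"name": "pack_osquery-snapshots-pack_python_packages", "hostIdentifier": "jumpvm", "unixTime": 1678958553, "columns": {"name": "setuptools", "version": "50.3.2"}, "action": "snapshot"}
"""


def group_records_by_name(text):
    """Parse newline-delimited JSON and group records by their ``name`` field.

    Hypothetical helper for illustration; not part of MSTICPy.
    """
    grouped = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        grouped[record["name"]].append(record)
    return dict(grouped)


records_by_name = group_records_by_name(sample_log)
for event_name, records in records_by_name.items():
    print(event_name, len(records))
```
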
Using the OSQuery provider
--------------------------

To use the OSQuery provider, create a ``QueryProvider``
instance, passing the string "OSQueryLogs" as the ``data_environment``
parameter. If you have not configured ``data_paths`` in *msticpyconfig.yaml*,
also pass the ``data_paths`` parameter to specify
the folders or files that you want to read.

.. code:: ipython3

    qry_prov = mp.QueryProvider("OSQueryLogs", data_paths=["~/my_logs"])

Calling the ``connect`` method triggers the provider to read the
log files.

.. code:: ipython3

    qry_prov.connect()

.. parsed-literal::

    100%|██████████| 2/2 [00:00<00:00, 25.01it/s]
    Data loaded.

Listing OSQuery tables
~~~~~~~~~~~~~~~~~~~~~~

.. code:: ipython3

    qry_prov.list_queries()

.. parsed-literal::

    ['osquery.acpi_tables',
     'osquery.device_nodes',
     'osquery.dns_resolvers',
     'osquery.events',
     'osquery.fim',
     'osquery.last',
     'osquery.listening_ports',
     'osquery.logged_in_users',
     'osquery.mounts',
     'osquery.open_sockets',
     'osquery.osquery_info',
     'osquery.osquery_packs',
     'osquery.osquerydb_size',
     'osquery.platform_info',
     'osquery.process_memory',
     'osquery.processes',
     'osquery.python_packages',
     'osquery.schedule',
     'osquery.shell_history']

Running an OSQuery query
~~~~~~~~~~~~~~~~~~~~~~~~

Each query returns a DataFrame containing the events of that
type (table) retrieved from the logs.

.. code:: python3

    qry_prov.osquery.processes()

==================================  ================  =========================  =====  ==========  =========  ======  ========  ========  =====  ==========
name                                hostIdentifier    unixTime                   ...    username    cmdline    euid    name_     parent    uid    username
==================================  ================  =========================  =====  ==========  =========  ======  ========  ========  =====  ==========
pack_osquery-custom-pack_processes  jumpvm            2023-03-16 03:08:58+00:00  ...    LOGIN                  0       kthreadd  2         0      root
pack_osquery-custom-pack_processes  jumpvm            2023-03-16 03:08:58+00:00  ...    LOGIN                  0       kthreadd  2         0      root
pack_osquery-custom-pack_processes  jumpvm            2023-03-16 03:08:58+00:00  ...    LOGIN                  0       kthreadd  2         0      root
pack_osquery-custom-pack_processes  jumpvm            2023-03-16 03:08:58+00:00  ...    LOGIN                  0       kthreadd  2         0      root
pack_osquery-custom-pack_processes  jumpvm            2023-03-16 03:08:58+00:00  ...    LOGIN                  0       kthreadd  2         0      root
==================================  ================  =========================  =====  ==========  =========  ======  ========  ========  =====  ==========

.. note:: Columns in the nested log data may be renamed
   if their names clash with an existing column name. See the
   ``name_`` column in the previous table.
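
The renaming behavior can be pictured with the following sketch. It assumes
that a nested key which collides with an existing key gets a trailing
underscore, which matches the ``name_`` column above but is an assumption
rather than something taken from the driver source. ``flatten_record`` is a
hypothetical helper, not an MSTICPy function.

```python
def flatten_record(record):
    """Merge nested ``decorations`` and ``columns`` keys into the top level.

    Assumption: a nested key that clashes with an existing key gets a
    trailing underscore (mirroring ``name_`` in the sample output).
    """
    nested_sections = ("decorations", "columns")
    flat = {key: value for key, value in record.items() if key not in nested_sections}
    for section in nested_sections:
        for key, value in record.get(section, {}).items():
            new_key = key
            while new_key in flat:
                new_key += "_"  # avoid overwriting an existing field
            flat[new_key] = value
    return flat


record = {
    "name": "pack_osquery-custom-pack_processes",
    "hostIdentifier": "jumpvm",
    "decorations": {"username": "LOGIN"},
    "columns": {"name": "kthreadd", "parent": "2", "uid": "0"},
}
flat = flatten_record(record)
print(flat)
```

Here the top-level ``name`` (the event type) is preserved, and the process
name from ``columns`` ends up under ``name_``.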

Other OSQuery Provider Documentation
------------------------------------


Built-in :ref:`data_acquisition/DataQueries:Queries for Local Data`.

:py:mod:`OSQuery driver API documentation<msticpy.data.drivers.local_osquery_driver>`
18 changes: 15 additions & 3 deletions docs/source/data_acquisition/SentinelIncidents.rst
@@ -8,13 +8,25 @@ It is possible to return a list incidents within a workspace, as well as get the
Whilst it is possible to access these incident details via the Incident table in the Workspace, you can also interact
with them via the Microsoft Sentinel APIs which are utilized in these functions.

See :py:meth:`get_incidents <msticpy.context.azure_sentinel.MicrosoftSentinel.list_incidents>`
See :py:meth:`list_incidents <msticpy.context.azure_sentinel.MicrosoftSentinel.list_incidents>`

.. code:: ipython3

    sentinel.list_incidents()

This returns a DataFrame with details of incidents.
This returns a DataFrame with details of incidents. By default, it returns the 50 most recent incidents.
To adjust which incidents are returned, pass a set of parameters to ``.list_incidents`` via the ``params`` parameter.
These parameters follow the format of the `Microsoft Sentinel API <https://learn.microsoft.com/rest/api/securityinsights/stable/incidents/list>`_
and include the following key items:

- ``$top``: controls how many incidents are returned
- ``$filter``: accepts an OData query that filters the returned items (see https://learn.microsoft.com/graph/filter-query-parameter)
- ``$orderby``: sorts the results by a specific column

.. code:: ipython3

    # Return up to 500 incidents whose title includes 'MSTICPy' and that were created after a set time
    params = {"$top": 500, "$filter": "contains(properties/title, 'MSTICPy') and properties/createdTimeUtc gt 2023-03-21T12:00:00Z"}
    sentinel.list_incidents(params)

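
If you build these parameters programmatically, a small helper can keep the
OData filter string readable. The sketch below is illustrative only:
``build_incident_params`` is a hypothetical function, not part of MSTICPy, and
the ``$orderby`` value assumes the ``properties/createdTimeUtc`` field used in
the filter above.

```python
from datetime import datetime, timezone


def build_incident_params(title_contains, created_after, top=50):
    """Build a ``params`` dict for an incident listing call.

    Hypothetical helper for illustration; not part of MSTICPy.
    """
    # OData expects a UTC timestamp such as 2023-03-21T12:00:00Z
    timestamp = created_after.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    odata_filter = (
        f"contains(properties/title, '{title_contains}') "
        f"and properties/createdTimeUtc gt {timestamp}"
    )
    return {
        "$top": top,
        "$filter": odata_filter,
        "$orderby": "properties/createdTimeUtc desc",
    }


params = build_incident_params(
    "MSTICPy", datetime(2023, 3, 21, 12, 0, tzinfo=timezone.utc), top=500
)
print(params)
```

The resulting dictionary can be passed to ``sentinel.list_incidents(params)``
in the same way as the hand-written one.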
To get details of a single incident, call ``.get_incident`` and pass the ID of the incident.
This ID can be found in the name column of the DataFrame returned by ``.list_incidents`` and appears in the form of a GUID.
@@ -25,7 +37,7 @@ See :py:meth:`get_incident <msticpy.context.azure.sentinel_core.MicrosoftSentine

.. code:: ipython3
sentinel.get_incidents(incident = "875409ee-9e1e-40f6-b0b8-a38aa64a1d1c")
sentinel.get_incident(incident = "875409ee-9e1e-40f6-b0b8-a38aa64a1d1c")
When calling `get_incident` there are a number of boolean flags you can set to return additional information
related to the incident.
31 changes: 31 additions & 0 deletions docs/source/getting_started/Installing.rst
@@ -199,3 +199,34 @@ exception message:
an *ImportError* exception, make sure
that you have installed the *extra* that corresponds to the
functionality you are trying to use.

Installing in Managed Spark compute in Azure Machine Learning Notebooks
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Installing *MSTICPy* on Managed (Automatic) Spark compute in an Azure Machine Learning workspace requires
different steps, because libraries are installed differently on this compute type.


.. note:: These notebooks require Azure ML Spark Compute. If you are using it for the first time, follow the guidelines in
   `Attach and manage a Synapse Spark pool in Azure Machine Learning (preview)`_.

.. _Attach and manage a Synapse Spark pool in Azure Machine Learning (preview):
   https://learn.microsoft.com/en-us/azure/machine-learning/how-to-manage-synapse-spark-pool?tabs=studio-ui

Once you have completed the prerequisites, AzureML Spark Compute appears in the Compute dropdown menu. Select it and run any cell to start a Spark session.
See `Managed (Automatic) Spark compute in Azure Machine Learning Notebooks`_ for more detailed steps with screenshots.

.. _Managed (Automatic) Spark compute in Azure Machine Learning Notebooks:
   https://learn.microsoft.com/en-us/azure/machine-learning/interactive-data-wrangling-with-apache-spark-azure-ml

To install libraries on Spark compute, you need to use a conda file to configure the Spark session.
Save the file below as ``conda.yml`` and check the "Upload conda file" checkbox.
Then select Browse and choose the conda file you saved. You can modify the version number as needed.

.. code-block:: yaml

    name: msticpy
    channels:
    - defaults
    dependencies:
    - bokeh
    - numpy
    - pip:
      - msticpy[azure]>=2.3.1

2 changes: 1 addition & 1 deletion msticpy/_version.py
@@ -1,2 +1,2 @@
"""Version file."""
VERSION = "2.3.2"
VERSION = "2.4.0"