Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Visual Basic Information Parsing, PCode detection #440

Merged
merged 33 commits into from
Mar 5, 2019

Conversation

JakubPruzinec
Copy link
Contributor

To be completed.

@s3rvac s3rvac requested a review from metthal December 17, 2018 13:30
@JakubPruzinec JakubPruzinec changed the title Visual Basic Information Parsing, PCode detection (IN PROGRESS) Visual Basic Information Parsing, PCode detection (READY TO REVIEW) Feb 11, 2019
@JakubPruzinec
Copy link
Contributor Author

I've implemented extraction of some VB metada features:

  • project info
  • code info
  • object table dump
  • extern table dump
  • guids, COM clsids and other COM related info
    and so on.

#138

Copy link
Member

@metthal metthal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some comments into your changes. Please have a look at them.


}

std::size_t headerSize()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be static?

std::uint32_t helpFileOffset; ///< offset to the string containing name of the Help file
std::uint32_t projNameOffset; ///< offset to the string containing project's name

VBHeader()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it's dangerous to leave the member variables uninitialized.

@@ -300,6 +300,7 @@ void double10ToDouble8(std::vector<unsigned char> &dest,

unsigned short byteSwap16(unsigned short val);
unsigned int byteSwap32(unsigned int val);
unsigned long byteSwap64(unsigned long val);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unsigned long is not 64-bit type when using MSVC.

return;
}

vbh.signature = *reinterpret_cast<std::uint32_t *>(&bytes.data()[offset]); offset += sizeof(vbh.signature);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is recurring in your changes. I was thinking whether it would be possible for you to either create a function that would do this or even better, move DynamicBuffer from src/unpacker to utils and use it. It basically does what you do but it abstracts away all the casting and also maintains endianess, so you don't have to do byte swaps afterwards.

@@ -781,6 +781,345 @@ bool FileInformation::hasRichHeaderRecords() const
return richHeader.hasRecords();
}


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove these whitespaces

std::string result;
if (!bytes)
{
return "";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace "" with {}. It is much more efficient.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}
else
{
const std::size_t maxC = std::pow(2, sizeof(std::string::value_type) * CHAR_BIT) - 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are doing power of 2 then it's much better to use bit shift rather than std::pow().

Copy link
Member

@metthal metthal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for taking your time and going through our comments. I just have a few other things. One is more serious and that is using namespace in the header files on multiple places. That is considered a bad practice. The other comment is just about the way you calculate the power of 2, because it's unusual that it is calculated from 2 and not from 1.

#include "retdec/unpacker/dynamic_buffer.h"
#include "retdec/utils/dynamic_buffer.h"

using namespace retdec::utils;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not put using namespace in header files. By including this file, you bring the whole namespace into all included files and this can cause you a headache or two.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#include "retdec/unpacker/plugin.h"

#define mpress_plugin plugin(retdec::unpackertool::mpress::MpressPlugin)

using namespace retdec::utils;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the comment I made earlier. Never put using namespace in header files.

@@ -12,11 +12,9 @@

#include "retdec/unpacker/decompression/compressed_data.h"

namespace retdec {
using namespace retdec::utils;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the comment I made earlier. Never put using namespace in header files.

@@ -8,6 +8,8 @@

#include "unpackertool/plugins/upx/decompressors/decompressor_scrambler.h"

using namespace retdec::utils;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the comment I made earlier. Never put using namespace in header files.

@@ -8,6 +8,8 @@

#include "unpackertool/plugins/upx/decompressors/decompressor.h"

using namespace retdec::utils;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the comment I made earlier. Never put using namespace in header files.

#include "retdec/unpacker/signature.h"

using namespace retdec::utils;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the comment I made earlier. Never put using namespace in header files.

#include "retdec/unpacker/dynamic_buffer.h"
#include "retdec/utils/dynamic_buffer.h"

using namespace retdec::utils;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the comment I made earlier. Never put using namespace in header files.

#include "retdec/unpacker/unpacking_stub.h"

using namespace retdec::utils;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the comment I made earlier. Never put using namespace in header files.

@@ -12,6 +12,8 @@
#include "unpackertool/plugins/upx/upx_stub.h"
#include "retdec/unpacker/signature.h"

using namespace retdec::utils;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the comment I made earlier. Never put using namespace in header files.

@@ -49,7 +49,7 @@ bool isNonasciiChar(unsigned char c) {
*/
std::string replaceChars(const std::string &str, bool (* predicate)(unsigned char)) {
std::stringstream result;
const std::size_t maxC = std::pow(2, sizeof(std::string::value_type) * CHAR_BIT) - 1;
const std::size_t maxC = (2 << (sizeof(std::string::value_type) * CHAR_BIT - 1)) - 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's usually done by shifting 1 and not 2. Basically

1 << 0 = 2^0 = 1
1 << 1 = 2^1 = 2
1 << 2 = 2^2 = 4
and so on...

I guess that's why you put - 1 after CHAR_BIT there but I think that shifting 1 would be much easier to grasp at the first sight as it is the usual way to calculate powers of 2.

@s3rvac s3rvac changed the title Visual Basic Information Parsing, PCode detection (READY TO REVIEW) Visual Basic Information Parsing, PCode detection Feb 25, 2019
@@ -8,7 +8,7 @@

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current Windows TeamCity build fails with a lot of the following errors:

[21:43:46] :	 [Step 1/6]     68>e:\work\414c7650a8fb97ac\retdec\src\unpackertool\plugins\upx\unfilter.cpp(88): error C2039: 'min': is not a member of 'std' [E:\work\414c7650a8fb97ac\retdec\build\src\unpackertool\plugins\upx\retdec-unpacker-upx.vcxproj]
[21:43:46] :	 [Step 1/6]     68>e:\work\414c7650a8fb97ac\retdec\src\unpackertool\plugins\upx\unfilter.cpp(130): error C2039: 'min': is not a member of 'std' [E:\work\414c7650a8fb97ac\retdec\build\src\unpackertool\plugins\upx\retdec-unpacker-upx.vcxproj]
...

I believe that you are missing #include <algorithm> in the file (this is where std::min() is defined). Alternatively, if the file transitively includes windows.h, we will probably need to compile with NOMINMAX.

return false;
}

DynamicBuffer structContent(bytes, retdec::utils::Endianness::LITTLE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't endianness be the endianness of the file itself? This seems like it forces little endian on the data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@metthal metthal merged commit 961f966 into avast:master Mar 5, 2019
s3rvac added a commit that referenced this pull request Mar 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants