Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Taint analysis in Double-Precision Floating-Point instruction #471

Closed
pfsun opened this issue Jan 18, 2017 · 14 comments
Closed

Taint analysis in Double-Precision Floating-Point instruction #471

pfsun opened this issue Jan 18, 2017 · 14 comments

Comments

@pfsun
Copy link

pfsun commented Jan 18, 2017

@JonathanSalwan For taint analysis in double-precisson floating-point instruction (e.g. subsd, mulsd, addsd, divsd), I must firstly provide the correct semantics, and then I can do taint analysis in these instructions, right?

@JonathanSalwan
Copy link
Owner

Right but i'm very curious about how you will provide the good semantics :). If you have ideas go ahead!

@pfsun
Copy link
Author

pfsun commented Jan 18, 2017

Actually, I don't have good idea now. I just want to confirm with you firstly. And then exploring and trying the possible way. I also wonder whether you have any tips to share with me? I see you have build the semantic for movsd.

@JonathanSalwan
Copy link
Owner

JonathanSalwan commented Jan 18, 2017

actually, if you only want the taint analysis you can only call taintAssignement, taintUnion etc... But if you want the symbolic execution you have to define the semantics. The problem is described here.

@pfsun
Copy link
Author

pfsun commented Jan 18, 2017

I know the problem for symbolic execution. Currently, I just need to do taint and no symbolic execution. I will try taintAssignment and taintUnion. Thanks Jonathan.

@JonathanSalwan
Copy link
Owner

JonathanSalwan commented Jan 18, 2017

Okay. Below an example for the addsd tainting:

+      void x86Semantics::addsd_s(triton::arch::Instruction& inst) {
+        auto& dst = inst.operands[0];
+        auto& src = inst.operands[1];
+
+        /* Spread taint */
+        this->taintEngine->taintUnion(dst, src);
+
+        /* Upate the symbolic control flow */
+        this->controlFlow_s(inst);
+      }
+
+

@pfsun
Copy link
Author

pfsun commented Jan 18, 2017

Great. I just go to one class. I will try them and then let you know. But it looks that isTainted() also need to check the symbolic expression, and need the semantic, right?

@JonathanSalwan
Copy link
Owner

Indeed. You have to define instruction->tainted to true. There is probably a patch to do out there.

@JonathanSalwan
Copy link
Owner

Please git pull origin dev-next. Then:

+      void x86Semantics::addsd_s(triton::arch::Instruction& inst) {
+        auto& dst = inst.operands[0];
+        auto& src = inst.operands[1];
+
+        /* Spread taint */
+        inst.setTaint(this->taintEngine->taintUnion(dst, src));
+
+        /* Upate the symbolic control flow */
+        this->controlFlow_s(inst);
+      }
+
+

@pfsun
Copy link
Author

pfsun commented Jan 18, 2017

The taint works prefect now. Thanks for you help.

@pfsun
Copy link
Author

pfsun commented Jan 18, 2017

Thanks Jonathan. Now I can taint the four float-point instruction. One more thing, I don't need the symbolic execution for float-point instruction. But I do need the semantic expression for them. Do you plan to add them in some point? I am also exploring them now.

@JonathanSalwan
Copy link
Owner

Do you plan to add them in some point?

It's a long term goal but it's a bit complex to do and I have so much easy things to do before :)

@pfsun
Copy link
Author

pfsun commented Feb 17, 2017

Hi @JonathanSalwan, now I start to explore SUBSD xmm1, xmm2/m64. I stopped on SUBSD xmm1, m64.
e.g.
4050a0: subsd xmm0, qword ptr [rbp - 0x60]
ref_87 = (ref_85 - ref_31) # SUBSD operation //Simplified Symbolic Expression
SymVar_0:128 for xmm0, SymVar_1:64 for qword ptr [rbp - 0x60]
I convert Xmm0 to symbolic variable SymVar_0, and convert memory to symbolic variable SymVar_1.
When I try to get the FullAst, I only can get SymVar_0, and it misses SymVar_1. Because the correct output should be SymVar_0 - SymVar_1. The reason should be different size problem for xmm1 and m64. When I call getFullAst() to get full ast, it cannot output the correct symbolic expression. How can I avoid the problem? Thanks.

@JonathanSalwan
Copy link
Owner

Past your code.

@pfsun
Copy link
Author

pfsun commented Feb 18, 2017

Now for SUBSD instruction, I just simply copy the code from sub_s and also disable the the code which is to decide whether the two node is the same size in BvsubNode::init. I know subsd_s should be different with sub_s. I just try firstly and will change it based on learning.

 void subsd_s(triton::arch::Instruction& inst) {
          auto& dst = inst.operands[0];
          auto& src = inst.operands[1];
          //auto bvSize = src.getBitSize();

          /* Create symbolic operands */
          auto op1 = triton::api.buildSymbolicOperand(inst, dst);
          auto op2 = triton::api.buildSymbolicOperand(inst, src);

          /* Create the semantics */
          auto node = triton::ast::bvsub(op1, op2);

          /* Create symbolic expression */
          auto expr = triton::api.createSymbolicExpression(inst, node, dst, "SUBSD operation");

          /* Spread taint */
          expr->isTainted = triton::api.taintUnion(dst, src);

          /* Upate symbolic flags */
          triton::arch::x86::semantics::af_s(inst, expr, dst, op1, op2);
          triton::arch::x86::semantics::cfSub_s(inst, expr, dst, op1, op2);
          triton::arch::x86::semantics::ofSub_s(inst, expr, dst, op1, op2);
          triton::arch::x86::semantics::pf_s(inst, expr, dst);
          triton::arch::x86::semantics::sf_s(inst, expr, dst);
          triton::arch::x86::semantics::zf_s(inst, expr, dst);

          /* Upate the symbolic control flow */
          triton::arch::x86::semantics::controlFlow_s(inst);
        }

For instruction: 4050a0: subsd xmm0, qword ptr [rbp - 0x60]

In my script, for insertCall(cbefore, INSERT_POINT.BEFORE), I do the following:

for expr in inst.getSymbolicExpressions():
        print expr
        print "AST", expr.getAst()
    print
if inst.getAddress() == 0x4050a0:
        var1 = convertRegisterToSymbolicVariable(REG.XMM0)
        print var1  ### SymVar_0:128
        memaddress = inst.getOperands()[1].getAddress()
        var2 = convertMemoryToSymbolicVariable(MemoryAccess(memaddress,  CPUSIZE.QWORD))
        print var2  ### SymVar_1:64

The result looks good:

**ref!87** = ((_ zero_extend 0) (bvsub ((_ extract 127 0) ref!85) (concat ((_ extract 7 0) ref!31) ((_ extract 7 0) ref!32) ((_ extract 7 0) ref!33) ((_ extract 7 0) ref!34) ((_ extract 7 0) ref!35) ((_ extract 7 0) ref!36) ((_ extract 7 0) ref!37) ((_ extract 7 0) ref!38)))) ; SUBSD operation
**AST** ((_ zero_extend 0) (bvsub ((_ extract 127 0) ref!85) (concat ((_ extract 7 0) ref!31) ((_ extract 7 0) ref!32) ((_ extract 7 0) ref!33) ((_ extract 7 0) ref!34) ((_ extract 7 0) ref!35) ((_ extract 7 0) ref!36) ((_ extract 7 0) ref!37) ((_ extract 7 0) ref!38))))

In my script, for insertCall(cafter, INSERT_POINT.AFTER), I also output the expr and getAst()
The result is:

ref!87 = SymVar_0 ; SUBSD operation
SymVar_0

Based on the result, I think whether the problem may be Symbolic simplification callback.
createSymbolicExpression -> createSymbolicRegisterExpression -> newSymbolicExpression -> processSimplification.

If you need my script and test binary, I can send them to you later. But I think the main problem is for symbolic. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants