
Batched inference on images using DensePose? #2117

Open
RSKothari opened this issue Oct 9, 2020 · 5 comments
Labels
densepose issues specific to densepose

Comments

@RSKothari

❓ How to do something using detectron2

Currently, DensePose reads in single images and infers dense annotations one at a time. This is very slow and quite wasteful. Does DensePose have the ability to read in batches of images to perform inference?

Describe what you want to do, including:

  1. what inputs you will provide, if any:
    A video, i.e. a sequence of frames

  2. what outputs you are expecting:
    A pickle file with DensePose annotations, produced much faster.



MathijsNL commented Oct 9, 2020

Hi there,

This might be a duplicate of #282.
I haven't used DensePose myself, but I suppose the usage should be the same as described in that issue:

You just need to call the model with a batch of inputs.

There is also #1986, which explains how to sort images before doing inference. You should be able to work it out with this info; let us know if anything is unclear.
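For reference, "a batch of inputs" here means a list of dicts, one per image, in the format detectron2 models consume. A minimal sketch of building that list (the helper name `to_model_inputs` is mine, not part of detectron2):

```python
import torch

def to_model_inputs(images):
    """Convert a list of HWC, BGR uint8 arrays into the list-of-dicts
    format that detectron2 models expect (one dict per image)."""
    inputs = []
    for img in images:
        height, width = img.shape[:2]
        # Each model input is a CHW float32 tensor plus the original size.
        tensor = torch.as_tensor(img.astype("float32").transpose(2, 0, 1))
        inputs.append({"image": tensor, "height": height, "width": width})
    return inputs

# With a model built via build_model(cfg) and put in eval mode, the whole
# batch then goes through a single forward call:
#     with torch.no_grad():
#         outputs = model(to_model_inputs(frames))
```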

@ppwwyyxx ppwwyyxx added the densepose issues specific to densepose label Oct 9, 2020
@RSKothari (Author)

@MathijsNL Thanks, but my question is specific to the DensePose module within Detectron2. It appears to read images one at a time when performing inference.

@vkhalidov (Contributor)

Yes, currently DensePose doesn't provide an efficient reader that would batch video inputs. I've got a pending PR to torchvision that addresses this issue.
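Until such a reader lands, frames from any decoder (e.g. cv2.VideoCapture) can be grouped into fixed-size batches with a small chunking helper. This is just a sketch, not part of DensePose, and `predictor` below is assumed to accept a list of images:

```python
def batched(frames, batch_size):
    """Group an iterable of frames into lists of at most batch_size items."""
    batch = []
    for frame in frames:
        batch.append(frame)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly smaller batch
        yield batch

# Hypothetical usage with a batch-capable predictor:
#     for batch in batched(frame_iterator, 16):
#         outputs = predictor(batch)
```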


mdsrLab commented Aug 8, 2024

For batched input inference, you can make the following change to apply_net.py (in the InferenceAction class):

@classmethod
def execute(cls: type, args: argparse.Namespace):
    # Needs `import torch` at the top of apply_net.py.
    batch_size = 16
    logger.info(f"Loading config from {args.cfg}")
    opts = []
    cfg = cls.setup_config(args.cfg, args.model, args, opts)
    logger.info(f"Loading model from {args.model}")
    predictor = DefaultPredictor(cfg)
    logger.info(f"Loading data from {args.input}")
    file_list = cls._get_input_file_list(args.input)
    if len(file_list) == 0:
        logger.warning(f"No input images for {args.input}")
        return
    context = cls.create_context(args, cfg)

    for batch_start in range(0, len(file_list), batch_size):
        batch_files = file_list[batch_start : batch_start + batch_size]
        # The predictor expects BGR images.
        img_list = [read_image(file_name, format="BGR") for file_name in batch_files]
        with torch.no_grad():
            outputs = predictor(img_list)
        for file_name, img, output in zip(batch_files, img_list, outputs):
            cls.execute_on_outputs(
                context, {"file_name": file_name, "image": img}, output["instances"]
            )
    cls.postexecute(context)

You would also need to change the __call__ method of the DefaultPredictor class in detectron2/engine/defaults.py:

def __call__(self, original_image_list):
        """
        Args:
            original_image_list (list[np.ndarray]): images of shape (H, W, C) (in BGR order).

        Returns:
            predictions (list[dict]):
                the outputs of the model, one dict per image.
                See :doc:`/tutorials/models` for details about the format.
        """
        with torch.no_grad():  # https://github.com/sphinx-doc/sphinx/issues/4258
            input_list = []
            for original_image in original_image_list:
                # Apply pre-processing to each image.
                if self.input_format == "RGB":
                    # whether the model expects BGR inputs or RGB
                    original_image = original_image[:, :, ::-1]
                height, width = original_image.shape[:2]
                image = self.aug.get_transform(original_image).apply_image(original_image)
                image = torch.as_tensor(image.astype("float32").transpose(2, 0, 1))
                # Note: .to() returns a new tensor; the result must be assigned.
                image = image.to(self.cfg.MODEL.DEVICE)

                inputs = {"image": image, "height": height, "width": width}
                input_list.append(inputs)

            predictions = self.model(input_list)
            return predictions

I have modified the predictor to take in a list of images and dump the results in the same format as sequential image processing.
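For the single pickle file the original question asks for, the accumulated per-image results can be written out once at the end. A stdlib-only sketch (the helper names are mine; the result structure is whatever execute_on_outputs accumulates, not a fixed DensePose format):

```python
import pickle

def dump_results(results, path):
    """Write the accumulated per-image results to a single pickle file."""
    with open(path, "wb") as f:
        pickle.dump(results, f)

def load_results(path):
    """Read the results back for later processing."""
    with open(path, "rb") as f:
        return pickle.load(f)
```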

@matejsuchanek

#5330 is dealing with this.

6 participants