I have been reading papers on the application of deep learning to drug discovery. So far, I have seen a number of protein (and ligand) representations used:
1- a grid based representation, which is very straightforward: a grid with a certain resolution, such as 1 Angstrom, is placed over the protein or the protein-ligand complex, and the contribution of all or some atoms of the ligand and the protein is recorded at the grid cell positions. This approach is not rotation and translation invariant: the orientation of the protein changes how it is presented to the deep learning model, and this may change the resulting computation. This is similar to how rotating or translating a face in an image may skew the recognition of that face.
This representation has been used for pose prediction and virtual screening, e.g. in the papers by Prof. Koes of U. Pittsburgh.
2- a distance matrix based representation, where the pairwise distances between e.g. all C-alpha atoms are represented. This representation has the advantage of being rotation and translation invariant.
This representation has been used for classifying predicted protein structures against CASP targets, e.g. in .
3- a graph based representation, where the atoms are the nodes and the bonds are the edges of the graph. A drawback of this approach is that it does not capture spatial neighborhood information if only the graph structure is taken into account.
This representation has been used for predicting toxicity properties of ligands in the graph convolutional neural network paper by Duvenaud et al.
4- an approximate representation used in ACNN (atomic convolutional neural networks), where only the atoms within 12 Angstrom of a center atom are considered, and furthermore these atoms are “pooled” together.
This representation has been used for predicting the free energy of a ligand-protein complex in the paper by Pande and his coworkers, although the reported predictions suggest that the model is heavily overfitting.
5- a topological representation, where the barcodes of the protein are obtained through persistent homology (which tracks the Betti numbers across length scales) and are then discretized to represent the protein. Such a representation is also translation and rotation invariant and is not too sensitive to the fine details of the atomic coordinates, which are subject to experimental error.
6- I have not seen a paper on this yet, but a point cloud representation of a protein structure is also feasible, although it also does not take into account the bond structure of the protein. PointNet could be used for this purpose.
This has been used in classification and regression tasks, such as scoring protein structure prediction candidates in the CASP competition or predicting binding affinity, in the papers listed below.
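To make the invariance contrast between representations 1 and 2 concrete, here is a minimal NumPy sketch. It is only an illustration under stated assumptions: the function names are hypothetical, random points stand in for real atomic coordinates, and the binary occupancy grid is a deliberate simplification (published grid methods typically use atom-type channels and smoothed densities). The 12 Angstrom neighbor selection mirrors only the cutoff idea from representation 4, not the actual ACNN pooling.

```python
import numpy as np


def distance_matrix(coords):
    """Pairwise Euclidean distances for an (N, 3) coordinate array."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def occupancy_grid(coords, origin, n_cells, resolution=1.0):
    """Binary occupancy grid: a cell is 1 if at least one atom falls in it.

    Simplification for illustration: no atom typing, no density smoothing.
    """
    grid = np.zeros((n_cells, n_cells, n_cells), dtype=np.uint8)
    idx = np.floor((coords - origin) / resolution).astype(int)
    inside = np.all((idx >= 0) & (idx < n_cells), axis=1)  # drop atoms outside the box
    i, j, k = idx[inside].T
    grid[i, j, k] = 1
    return grid


def neighbors_within(coords, center_index, cutoff=12.0):
    """Indices of atoms within `cutoff` of the center atom (center excluded)."""
    dist = np.linalg.norm(coords - coords[center_index], axis=1)
    mask = dist <= cutoff
    mask[center_index] = False
    return np.flatnonzero(mask)


def random_rotation(rng):
    """Random proper rotation matrix from the QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))   # fix the column sign ambiguity of QR
    if np.linalg.det(q) < 0:      # turn a reflection into a rotation
        q[:, 0] = -q[:, 0]
    return q


rng = np.random.default_rng(0)
coords = rng.uniform(-10.0, 10.0, size=(50, 3))   # stand-in for C-alpha positions
R = random_rotation(rng)
t = np.array([3.0, -2.0, 5.0])
moved = coords @ R.T + t                          # same "structure", new pose

# The distance matrix should agree up to floating-point tolerance.
dm_invariant = bool(np.allclose(distance_matrix(coords), distance_matrix(moved)))

# The grid is tied to a fixed box, so the moved pose fills different cells.
origin = np.full(3, -25.0)
grid_unchanged = bool(np.array_equal(
    occupancy_grid(coords, origin, n_cells=50),
    occupancy_grid(moved, origin, n_cells=50),
))

print("distance matrix unchanged by the rigid motion:", dm_invariant)
print("occupancy grid unchanged by the rigid motion:", grid_unchanged)
print("neighbors of atom 0 within 12 Angstrom:", neighbors_within(coords, 0).size)
```

Because the grid depends on the pose, grid-based models usually need something extra, such as training on randomly rotated and translated copies of each structure, whereas the distance matrix needs no such treatment.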
Deep convolutional networks for quality assessment of protein folds. Georgy Derevyanko, Sergei Grudinin, Yoshua Bengio, and Guillaume Lamoureux. arXiv:1801.06252v1 [q-bio.BM], 18 Jan 2018.
TopologyNet: Topology based deep convolutional neural networks for biomolecular property predictions. Zixuan Cang and Guo-Wei Wei. arXiv:1704.00063v1 [q-bio.QM], 31 Mar 2017.
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. Charles R. Qi, Hao Su, Kaichun Mo, and Leonidas J. Guibas. arXiv:1612.00593v2 [cs.CV], 10 Apr 2017.
Convolutional networks on graphs for learning molecular fingerprints. Duvenaud et al. NeurIPS 2015.