Sparse inverse covariance estimation is a popular tool across a variety of scientific disciplines for capturing the underlying dependency relationships in multivariate data. Unfortunately, most estimators are not scalable enough to handle the sizes of modern high-dimensional data sets (often on the order of terabytes) and assume Gaussian samples. To address these deficiencies, we introduce HP-CONCORD, a highly scalable optimization method for estimating a sparse inverse covariance matrix based on a regularized pseudolikelihood framework without assuming Gaussianity. Our parallel proximal gradient method uses a novel communication-avoiding linear algebra algorithm and runs across a multi-node cluster with up to 1,000 nodes (24,000 cores), achieving parallel scalability on problems with up to approximately 819 billion parameters (1.28 million dimensions). Even on a single node, HP-CONCORD demonstrates scalability, outperforming a state-of-the-art method. We also use HP-CONCORD to estimate the underlying dependency structure of the brain from fMRI data and use the result to identify functional regions automatically. The results show good agreement with a clustering from the neuroscience literature.