We introduce an approach to discriminative clustering that jointly learns the metric or similarity and the cluster assignments from unlabeled data, optionally supplemented with labeled data. The metric or similarity is assumed to be specified in a differentiable programming framework, that is, as a layer defined by a parameterized functional amenable to automatic differentiation. We define a discriminative clustering layer that enjoys better conditioning than the more commonly used k-means layer. The approach applies to any amount of labeled and unlabeled data and adjusts gracefully to the amount of supervision. We present experimental results assessing the effectiveness of our method.
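
To make the idea of a differentiable clustering layer concrete, the following is a minimal JAX sketch and not the method itself. It assumes a least-squares discriminative clustering objective, $\min_W \frac{1}{n}\|Y - \Phi W\|_F^2 + \lambda \|W\|_F^2$, which is one common formulation and is not specified in the text above. The feature map `features`, the parameter `theta`, the regularization weight `lam`, and the toy data are all hypothetical. The assignment matrix `Y` is held fixed here for simplicity; in a joint scheme it would also be optimized, with rows for labeled points fixed to their known labels.

```python
import jax
import jax.numpy as jnp


def features(theta, X):
    # Hypothetical one-layer feature map standing in for the learned similarity.
    return jnp.tanh(X @ theta)


def clustering_loss(theta, X, Y, lam=1e-2):
    # Assumed objective: ridge regression of assignments Y on features Phi,
    # with the linear map W eliminated in closed form.
    Phi = features(theta, X)
    n, d = Phi.shape
    gram = Phi.T @ Phi + n * lam * jnp.eye(d)
    W = jnp.linalg.solve(gram, Phi.T @ Y)
    # Optimal value: (1/n) * (tr(Y^T Y) - tr(Y^T Phi W)).
    return (jnp.sum(Y * Y) - jnp.sum(Y * (Phi @ W))) / n


# Toy data: 6 points in 3 dimensions, 3 clusters, 4 learned features.
key_x, key_t = jax.random.split(jax.random.PRNGKey(0))
X = jax.random.normal(key_x, (6, 3))
theta = 0.1 * jax.random.normal(key_t, (3, 4))
Y = jax.nn.one_hot(jnp.array([0, 0, 1, 1, 2, 2]), 3)

# Automatic differentiation gives the gradient with respect to theta.
loss, grad = jax.value_and_grad(clustering_loss)(theta, X, Y)
```

Because `W` is eliminated by a regularized linear solve, the loss is a smooth function of `theta` alone, so gradients flow through the clustering objective into the feature map.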