Trying to Contribute to Apache Kafka is a Complete Nightmare

It has now been over three months since I submitted a KIP with a feature-complete pull request.

If you want to make a change that requires modification of a public API in Apache Kafka, you have to submit a Kafka Improvement Proposal (KIP). This requires getting wiki permissions and drafting your idea in full for submission and discussion on the Kafka dev list.

In practice, this effectively requires a nearly complete functional prototype of your idea in order to have a fruitful discussion of the proposed changes. It is unclear to me how a KIP that is only a good idea, with no working code behind it, would ever make it through the design process.

Here is the still-ongoing timeline of attempting to get this feature into the main Kafka project:

  • Early Fall 2021: I identify a need we have for Source Connectors to be able to handle specific Producer write failures. The Connect Framework immediately fails the task with no recourse for connector writers.
  • October 5, 2021: I submit the KIP, PR, JIRA, and dev list discussion email thread.
  • October 11, 2021: No feedback has been given on the KIP or PR. I ping the dev list again.
  • October 18, 2021: Co-worker pings the thread requesting feedback.
  • October 19, 2021: I send a separate email to the dev list outlining my frustration with this process. A PMC member responds, as does another developer who has open features. Nothing ever comes of this email chain.
  • October 27, 2021: No feedback has been received on my original email. I ping the dev list again.
  • October 28 - Nov 2, 2021: There is some back and forth and I refine the changes.

Minor tangent here: my previous experience with contributing to open source was with Apache NiFi, ironically to update their REST client that talks to Confluent's Schema Registry. We needed to be able to send arbitrary HTTP headers for authentication, etc.

Here is the NiFi timeline:

  • July 2021: I identify a need to have the NiFi client talk to Confluent's Schema Registry with specific auth requirements. This requires dynamic headers, which the current implementation does not support.
  • July 21, 2021: JIRA and PR submitted.
  • July 22, 2021: Reviewed and merged.

To recap, there was a ~24-hour turnaround from issue and full code submission to merge. That is excellent.

Part of the reason I capitulated on my original design in Kafka was for speed of inclusion into the baseline. Sure, what got crafted out of the dev list might not be my first choice, but having the functionality in the trunk was more important. Had I known it would be three months and counting, I would not have changed any of the originally proposed design. We are using that originally proposed design in our custom build.

Back to the Kafka timeline:

  • November 5, 2021: No additional feedback has been received. I ping the dev list with intent to call a vote.
  • November 8, 2021: Vote is called to pass KIP-779.
  • November 11, 2021: Feedback is given about an additional PR that has not been merged for months, but would be affected by semantics in my changes.
  • November 15-16, 2021: Additional dialogue on using an existing configuration item in original discussion thread.
  • November 19, 2021: (11 days on vote) Still do not have enough binding votes, ping vote thread. There is feedback on the PR.
  • November 29, 2021: (21 days on vote) Still do not have enough binding votes, ping vote thread.
  • November 29, 2021: Vote passes later that day with 2 additional binding votes.
  • November 30, 2021: Query vote thread on procedure. Update Wiki.
  • December 6, 2021: I implement the recommendations on the PR.
  • December 10, 2021: No feedback on PR. I ping the original discussion thread.

As of this writing (January 8, 2022), there has been no additional commentary on the pull request, discussion thread, or vote thread. The tally so far:

95 Days from Original Submission

33 Days from last update to Pull Request with no feedback
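
For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes those day counts. The dates come straight from the timeline above; the variable names are mine, purely for illustration.

```python
from datetime import date

# Dates taken from the Kafka timeline above.
kip_submitted = date(2021, 10, 5)    # KIP, PR, JIRA, and dev list thread submitted
vote_called = date(2021, 11, 8)      # vote called on the KIP
vote_passed = date(2021, 11, 29)     # vote passes
last_pr_update = date(2021, 12, 6)   # recommendations implemented on the PR
as_of = date(2022, 1, 8)             # date of this writing

# Subtracting two dates yields a timedelta; .days is the whole-day difference.
days_since_submission = (as_of - kip_submitted).days
days_since_pr_update = (as_of - last_pr_update).days
days_on_vote = (vote_passed - vote_called).days

print(f"{days_since_submission} days from original submission")
print(f"{days_since_pr_update} days from last PR update with no feedback")
print(f"{days_on_vote} days for the vote to pass")
```

The same subtraction confirms the "days on vote" figures in the timeline.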

Perhaps my NiFi experience spoiled me, but this is completely insane for a project as large and widely used as Kafka. From the unresponsive gatekeeping of KIPs, to getting actual feedback, to getting people to vote, to getting reviews on pull requests, to actually getting something merged, it has been a painful process every step of the way.

I don't understand why that is. As soon as the KIP was approved, the PR should have been reviewed. If the KIP was not desired, it should have been voted down. To quote myself from the KIP process email:

The current process of KIPs needs to be improved. There are at least a handful of open KIPs with existing PRs that are in a purgatory state. I understand that people are busy, but if you are going to gatekeep Kafka with this process, then it must be responsive. Even if the community decides they do not want the change, the KIP should be addressed and closed out.

The entire wiki page is a graveyard of unresponded KIPs. For some changes, it takes a nontrivial amount of effort to put together the wiki page and one has to essentially write the code implementation hoping that it will be pulled into the codebase. This is very frustrating as an external developer to have put in the work and then effectively be ignored.

We have to maintain a custom build because KIPs are not debated, voted on, or merged in a timely manner.

Update:

The PR was finally reviewed and merged on January 27! Once that process started, it was completed very quickly.