Getting Lines of Code from GitLab Logs and Fixing Missing Data (Python Version)

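First, the following Python code loops over every project, its branches, and the commits on the chosen branch, and records the total number of changed lines for each commit: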

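import json

import requests

# Assumption: branch_dict is not defined in the original post. It is taken to map a
# dotted project namespace to the branch to scan; projects not listed use 'develop'.
branch_dict = {}


def cleaning_data(ip, token):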
    dict_out = {}
    page1 = 1
    list1_all = []
    list1 = [{"name1": "new"}]
    while list1:
        url1 = '%s/api/v4/projects/?private_token=%s&per_page=100&page=%s' % (ip, token, page1)
        # print(url1)
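        r1 = requests.get(url1)
        # Fail fast on HTTP errors: an error body is a JSON object, which would keep this loop running
        r1.raise_for_status()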
        list1 = json.loads(r1.text)
        list1_all += list1
        page1 += 1
        # print(list1_all)

    for i in list1_all:
        project_id = i['id']
        namespace = '.'.join(i['path_with_namespace'].split('/'))
        page2 = 1
        list2_all = []
        list2 = [{"name2": "new"}]
        while list2:
            url2 = '%s/api/v4/projects/%s/repository/branches?private_token=%s&per_page=100&page=%s' % (
                ip, project_id, token, page2)
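            r2 = requests.get(url2)
            # Fail fast on HTTP errors: an error body is a JSON object, which would keep this loop running
            r2.raise_for_status()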
            list2 = json.loads(r2.text)
            list2_all += list2
            page2 += 1
        # print(list2_all)

        for j in list2_all:
            if namespace in branch_dict.keys():
                branch = branch_dict[namespace]
            else:
                branch = 'develop'
            if j['name'] == branch:
                page3 = 1
                list3_all = []
                list3 = [{"name3": "new"}]
                while list3:
                    url3 = '%s/api/v4/projects/%s/repository/commits?ref_name=%s&private_token=%s&per_page=100&page=%s' % (
                        ip, project_id, branch, token, page3)
                    if namespace in ['vbs.postloan.public-service.AnalysisDeductConfigWebApi',
                                     'vbs.postloan.public-service.rate-platform',
                                     'vbs.postloan.public-service.corporate-account-test-service']:
                        print(url3)
                        print("Project to fetch: " + namespace)
                        print("Branch to fetch: " + branch)
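                    r3 = requests.get(url3)
                    # Fail fast on HTTP errors: an error body is a JSON object, which would keep this loop running
                    r3.raise_for_status()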
                    list3 = json.loads(r3.text)
                    list3_all += list3
                    page3 += 1

                dict4 = {}
                for k in list3_all:
                    # print(k)
                    url4 = '%s/api/v4/projects/%s/repository/commits/%s?private_token=%s&per_page=100' % (
                        ip, project_id, k['id'], token)
                    r4 = requests.get(url4)
                    commit_json = json.loads(r4.text)
                    # if "Merge" not in (commit_json['message'] + commit_json['title']):
                    dt = commit_json['committed_date']
                    dict4[dt] = commit_json['stats']['total']
                # print(dict4)
                # Add the {commit time: total lines changed} dictionary to dict_out
                if namespace in dict_out.keys():
                    dict_out[namespace].update(dict4)
                else:
                    dict_out[namespace] = dict4

    # print(dict_out)
    return dict_out
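
The dictionary returned by cleaning_data maps each project namespace to {committed_date: total lines changed}. As a minimal sketch of how that result might be turned into daily line counts, the helper below sums the totals per calendar day. The function name lines_per_day and the sample data are made up for illustration, and it assumes committed_date is the ISO 8601 string GitLab returns, so the day is taken as written in the commit's own time zone offset.

```python
from collections import defaultdict


def lines_per_day(dict_out):
    """Sum the per-commit totals of each project by calendar day."""
    result = {}
    for namespace, commits in dict_out.items():
        daily = defaultdict(int)
        for committed_date, total in commits.items():
            # committed_date is an ISO 8601 string; its first 10 characters are YYYY-MM-DD
            daily[committed_date[:10]] += total
        result[namespace] = dict(daily)
    return result


# Hypothetical sample in the same shape as the return value of cleaning_data
sample = {
    "group.demo-project": {
        "2020-08-18T09:15:00.000+08:00": 12,
        "2020-08-18T17:40:21.000+08:00": 30,
        "2020-08-19T10:05:03.000+08:00": 7,
    }
}
print(lines_per_day(sample))
# {'group.demo-project': {'2020-08-18': 42, '2020-08-19': 7}}
```

Note that the original dictionary is keyed by commit time, so two commits with exactly the same committed_date would overwrite each other before this step is reached.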


In actual use, some projects and branches turned out to be missing from the results, so I looked into it.

Reference: https://blog.csdn.net/a64910807/article/details/102162256?utm_medium=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-1.channel_param&depth_1-utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-1.channel_param

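The number of items per page has to be specified at the end of the URL with per_page=num; without it GitLab returns only 20 items per page by default, which is why entries go missing.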

Example: https://git.kkcredit.cn/api/v4/projects/?private_token=sHdPSukqHVZ8zk-ntDYe&per_page=100

The URLs for the later requests (branches and commits) follow the same pattern.

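In practice the maximum page size is 100; writing 100000 makes no difference. To get everything, page=num is also needed to walk through the pages one by one.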

Example: https://git.kkcredit.cn/api/v4/projects/?private_token=sHdPSukqHVZ8zk-ntDYe&per_page=100&page=2
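
The per_page plus page loop is the same for projects, branches, and commits, so it can be factored out. The following is only a sketch of that idea, not the code from this post: iter_all_items and fetch_page are hypothetical names, and the fake data stands in for the real API so the snippet runs without a GitLab server.

```python
from typing import Callable, Iterator, List


def iter_all_items(fetch_page: Callable[[int], List[dict]],
                   max_pages: int = 10000) -> Iterator[dict]:
    """Yield every item of a paginated listing.

    fetch_page is a hypothetical callable: given a 1-based page number it
    returns the decoded JSON body of that page (requested with per_page=100).
    """
    for page in range(1, max_pages + 1):
        items = fetch_page(page)
        # An error body is a JSON object, not a list; stop instead of looping on it.
        if not isinstance(items, list):
            raise ValueError("page %d did not return a list: %r" % (page, items))
        if not items:
            return  # an empty page means the previous page was the last one
        yield from items


# Stand-in for the real API: 250 fake projects served 100 per page.
FAKE_PROJECTS = [{"id": n} for n in range(250)]


def fake_fetch(page: int, per_page: int = 100) -> List[dict]:
    start = (page - 1) * per_page
    return FAKE_PROJECTS[start:start + per_page]


all_projects = list(iter_all_items(fake_fetch))
print(len(all_projects))  # 250
```

Against a real server, fetch_page could be something like lambda page: requests.get(url, params={"private_token": token, "per_page": 100, "page": page}).json(), where url and token are your own values.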

Also, be sure to use ref_name=branch_name to select the branch.
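
As a small illustration of the ref_name parameter, the sketch below builds the commits URL for one branch. The function name commits_url, the host gitlab.example.com, and the token value are placeholders, not values from this post. It uses urlencode so that a branch name containing a slash is escaped, which the plain string formatting in the code above does not do.

```python
from urllib.parse import urlencode


def commits_url(base_url, project_id, branch, token, page=1, per_page=100):
    """Build the commits listing URL for one branch of one project."""
    query = urlencode({
        "ref_name": branch,  # without this only the default branch is listed
        "private_token": token,
        "per_page": per_page,
        "page": page,
    })
    return "%s/api/v4/projects/%s/repository/commits?%s" % (base_url, project_id, query)


# Placeholder host, project id, and token, for illustration only
print(commits_url("https://gitlab.example.com", 42, "feature/login", "TOKEN", page=2))
# https://gitlab.example.com/api/v4/projects/42/repository/commits?ref_name=feature%2Flogin&private_token=TOKEN&per_page=100&page=2
```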

Reference for calling the GitLab log API: https://blog.csdn.net/wenwen513/article/details/95647364

Reference for the missing projects and branches: https://www.oschina.net/question/2420680_2290255

posted @ 2020-08-19 10:52 carlvine
